Open nepoyasnit opened 2 months ago
Also, I've checked torch.cuda.memory_allocated() and it was constant, but torch.cuda.memory_cached() was increasing. Maybe someone can explain to me why the CUDA cache is growing?

When you free a tensor, the memory is not returned to the GPU immediately. Instead, it is cached by PyTorch to be reused for future allocations. This is why torch.cuda.memory_cached() can increase over time, even if torch.cuda.memory_allocated() remains constant.
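The caching behavior can be sketched with a toy allocator. This is pure Python and purely illustrative (PyTorch's real CUDA caching allocator lives in C++ and is far more sophisticated), but it shows why "allocated" can stay flat while "reserved/cached" keeps growing when request sizes vary:

```python
class CachingAllocator:
    """Toy model of a caching allocator: freed blocks are kept in a
    cache for reuse instead of being returned to the device."""

    def __init__(self):
        self.allocated = 0   # bytes handed out to live tensors
        self.reserved = 0    # bytes held from the device (allocated + cached)
        self.cache = []      # freed block sizes available for reuse

    def malloc(self, size):
        if size in self.cache:   # reuse a cached block of the same size
            self.cache.remove(size)
        else:                    # cache miss: reserve fresh device memory
            self.reserved += size
        self.allocated += size
        return size

    def free(self, size):
        # Freeing does NOT shrink `reserved`: the block goes to the cache.
        self.allocated -= size
        self.cache.append(size)

    def empty_cache(self):
        # Rough analogue of torch.cuda.empty_cache(): release cached blocks.
        self.reserved -= sum(self.cache)
        self.cache.clear()


alloc = CachingAllocator()
for size in (1 << 20, 2 << 20, 4 << 20):   # requests of different sizes
    block = alloc.malloc(size)
    alloc.free(block)

# Every request missed the cache (different size each time), so reserved
# memory kept growing even though no tensor stayed alive.
print(alloc.allocated)  # 0
print(alloc.reserved)   # 7340032 (1 MiB + 2 MiB + 4 MiB)
alloc.empty_cache()
print(alloc.reserved)   # 0
```

This mirrors what you measured: torch.cuda.memory_allocated() counts only live tensors, while torch.cuda.memory_cached() (renamed torch.cuda.memory_reserved() in newer PyTorch versions) also includes cached blocks; torch.cuda.empty_cache() hands the cached blocks back to the driver. Variable-length audio inputs produce variable-sized allocations, which is exactly the pattern that makes a size-bucketed cache grow.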
Yes, but it is strange that the enhancer's memory footprint grows so much as requests come in. Also, I think it's strange to have 13 GB of CUDA memory reserved when only ~3 GB of that is allocated.
I've run the Gradio app from the repo and saw that CUDA memory grows rapidly from ~3 GB to ~12 GB. If I put in a short audio file, it also increases, but not as much.

CUDA memory when I put in a short (1 second) audio file:

CUDA memory after putting in a long audio file (2.5 minutes):