Open nepoyasnit opened 2 months ago
Also, I've checked torch.cuda.memory_allocated() and it was constant, but torch.cuda.memory_cached() was increasing. Maybe someone can explain to me why the CUDA cache is growing?

When you free a tensor, the memory is not returned to the GPU immediately. Instead, it is cached by PyTorch to be reused for future allocations. This is why torch.cuda.memory_cached() can increase over time, even if torch.cuda.memory_allocated() remains constant.
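The caching behavior can be sketched with a toy allocator. This is pure Python and purely illustrative (PyTorch's real CUDA caching allocator lives in C++ and is far more sophisticated), but it shows why "allocated" can stay flat while "reserved/cached" keeps growing when request sizes vary:

```python
class CachingAllocator:
    """Toy model of a caching allocator: freed blocks are kept in a
    cache for reuse instead of being returned to the device."""

    def __init__(self):
        self.allocated = 0   # bytes handed out to live tensors
        self.reserved = 0    # bytes held from the device (allocated + cached)
        self.cache = []      # freed block sizes available for reuse

    def malloc(self, size):
        if size in self.cache:   # reuse a cached block of the same size
            self.cache.remove(size)
        else:                    # cache miss: reserve fresh device memory
            self.reserved += size
        self.allocated += size
        return size

    def free(self, size):
        # Freeing does NOT shrink `reserved`: the block goes to the cache.
        self.allocated -= size
        self.cache.append(size)

    def empty_cache(self):
        # Rough analogue of torch.cuda.empty_cache(): release cached blocks.
        self.reserved -= sum(self.cache)
        self.cache.clear()


alloc = CachingAllocator()
for size in (1 << 20, 2 << 20, 4 << 20):   # requests of different sizes
    block = alloc.malloc(size)
    alloc.free(block)

# Every request missed the cache (different size each time), so reserved
# memory kept growing even though no tensor stayed alive.
print(alloc.allocated)  # 0
print(alloc.reserved)   # 7340032 (1 MiB + 2 MiB + 4 MiB)
alloc.empty_cache()
print(alloc.reserved)   # 0
```

This mirrors what you measured: torch.cuda.memory_allocated() counts only live tensors, while torch.cuda.memory_cached() (renamed torch.cuda.memory_reserved() in newer PyTorch versions) also includes cached blocks; torch.cuda.empty_cache() hands the cached blocks back to the driver. Variable-length audio inputs produce variable-sized allocations, which is exactly the pattern that makes a size-bucketed cache grow.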
Yes, but it is strange that the enhancer's memory footprint grows so much as requests come in. Also, I think it's strange to have 13 GB of CUDA memory reserved when only ~3 GB of that is allocated.
I've run the Gradio app from the repo and saw that CUDA memory grows rapidly from ~3 GB to ~12 GB. If I put in a short audio file, it also increases, but not as much.

CUDA memory when I put in a short (1 second) audio file:

CUDA memory after putting in a long audio file (2.5 minutes):