Closed omarsiddiqi224 closed 6 months ago
I think the issue is with the --batch-size. You can set it to a lower value; --batch-size 2 should work well, for example.
(Closing this for now, feel free to re-open if that doesn't fix it.)
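For reference, the lowered batch size can be passed straight on the command line. This is a sketch, assuming the insanely-fast-whisper CLI and a hypothetical input file name:

```shell
# Hypothetical invocation: lower --batch-size until the model fits in VRAM
insanely-fast-whisper --file-name audio.mp3 --batch-size 2
```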
I am getting this error:

```
CUDA out of memory. Tried to allocate 52.00 MiB. GPU 0 has a total capacty of 11.00 GiB of which 4.47 GiB is free. Of the allocated memory 5.22 GiB is allocated by PyTorch, and 257.79 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
@Vaibhavs10 I'm having the same issue when using --timestamp word.
```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB. GPU 0 has a total capacity of 21.96 GiB of which 896.00 KiB is free. Including non-PyTorch memory, this process has 21.95 GiB memory in use. Of the allocated memory 21.14 GiB is allocated by PyTorch, and 591.83 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```
Happens for batch sizes > 4. Tried on L4 and T4 GPUs.
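The error message above already names the allocator knob to try. A minimal sketch of applying it from Python — the variable name and value come straight from the log; the one detail the log doesn't spell out is that it must be set before the first CUDA allocation, ideally before `import torch`:

```python
import os

# Must be set before any CUDA allocation (ideally before `import torch`),
# otherwise the caching allocator has already been configured.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
```

Setting it in the shell (`export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True`) before launching the script achieves the same thing.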
I set up the program to work with Gradio; here is a snippet of the code (reconstructed indentation, `import os` added):

```python
import os

def transcribe2(audio_file):
    if audio_file:
        head, tail = os.path.split(audio_file)
        path = head
```
Once I start it up and upload an audio file, it works beautifully. However, if I upload a second audio file, it breaks and gives me the following error:
```
key_states = torch.cat([past_key_value[0].transpose(1, 2), key_states], dim=1)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 21.99 GiB of which 15.75 MiB is free. Process 140926 has 14.49 GiB memory in use. Including non-PyTorch memory, this process has 7.49 GiB memory in use. Of the allocated memory 6.80 GiB is allocated by PyTorch, and 460.97 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
If I don't diarize it, I can upload multiple audio files sequentially without any error. However, after around 5 audio files, it breaks. With diarization it breaks on the second upload.
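This pattern (works once, then OOMs on a later call) usually means references to the previous run's tensors are still alive, so the allocator can never reuse that memory. A hedged sketch of a cleanup helper to call at the end of each Gradio handler — the function name is made up here, and it only helps if the handler also drops its own references (e.g. `del` on the pipeline outputs) first:

```python
import gc

def free_gpu_memory():
    # Collect unreachable Python objects first, so the CUDA tensors
    # they hold become freeable by the caching allocator.
    gc.collect()
    try:
        import torch  # guarded so this sketch runs even without torch installed
        if torch.cuda.is_available():
            # Return cached-but-unused blocks to the driver.
            torch.cuda.empty_cache()
    except ImportError:
        pass
```

Calling this at the end of `transcribe2` is the usual first thing to try; if memory still climbs across uploads, the likely culprit is the model or diarization pipeline being re-instantiated on every call without the old instance being released.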