JSchmie / ScrAIbe-WebUI

WebUI for ScrAIbe
https://github.com/JSchmie/ScrAIbe-WebUI
GNU General Public License v3.0

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 312.00 MiB. GPU #31

Closed: neohunter closed this 2 months ago

neohunter commented 2 months ago

I'm getting this error when trying to process an audio file:

❯ docker compose up
[+] Running 1/1
 ✔ Container scraibe_large  Recreated                                                                                                           0.4s
Attaching to scraibe_large
scraibe_large  | /opt/conda/lib/python3.10/site-packages/pyannote/audio/core/io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
scraibe_large  |   torchaudio.set_audio_backend("soundfile")
scraibe_large  | Traceback (most recent call last):
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
scraibe_large  |     response = await route_utils.call_process_api(
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/gradio/route_utils.py", line 321, in call_process_api
scraibe_large  |     output = await app.get_blocks().process_api(
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
scraibe_large  |     result = await self.call_function(
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
scraibe_large  |     prediction = await anyio.to_thread.run_sync(  # type: ignore
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
scraibe_large  |     return await get_async_backend().run_sync_in_worker_thread(
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
scraibe_large  |     return await future
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
scraibe_large  |     result = context.run(func, *args)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
scraibe_large  |     response = f(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
scraibe_large  |     response = f(*args, **kwargs)
scraibe_large  |   File "/app/scraibe_webui/utils/interactions.py", line 126, in run_scraibe
scraibe_large  |     res, out_str , out_json = _pipe.autotranscribe(source = source,
scraibe_large  |   File "/app/scraibe_webui/utils/wrapper.py", line 71, in autotranscribe
scraibe_large  |     result = self.model.autotranscribe(source, **_kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/scraibe/autotranscript.py", line 151, in autotranscribe
scraibe_large  |     diarisation = self.diariser.diarization(dia_audio, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/scraibe/diarisation.py", line 84, in diarization
scraibe_large  |     diarization = self.model(audiofile, *args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/core/pipeline.py", line 325, in __call__
scraibe_large  |     return self.apply(file, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/pipelines/speaker_diarization.py", line 514, in apply
scraibe_large  |     embeddings = self.get_embeddings(
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/pipelines/speaker_diarization.py", line 349, in get_embeddings
scraibe_large  |     embedding_batch: np.ndarray = self._embedding(
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/pipelines/speaker_verification.py", line 706, in __call__
scraibe_large  |     embeddings = self.model_(
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
scraibe_large  |     return self._call_impl(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
scraibe_large  |     return forward_call(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/models/embedding/wespeaker/__init__.py", line 112, in forward
scraibe_large  |     return self.resnet(fbank, weights=weights)[1]
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
scraibe_large  |     return self._call_impl(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
scraibe_large  |     return forward_call(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/models/embedding/wespeaker/resnet.py", line 211, in forward
scraibe_large  |     out = self.layer1(out)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
scraibe_large  |     return self._call_impl(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
scraibe_large  |     return forward_call(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
scraibe_large  |     input = module(input)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
scraibe_large  |     return self._call_impl(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
scraibe_large  |     return forward_call(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/models/embedding/wespeaker/resnet.py", line 102, in forward
scraibe_large  |     out = self.bn2(self.conv2(out))
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
scraibe_large  |     return self._call_impl(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
scraibe_large  |     return forward_call(*args, **kwargs)
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py", line 175, in forward
scraibe_large  |     return F.batch_norm(
scraibe_large  |   File "/opt/conda/lib/python3.10/site-packages/torch/nn/functional.py", line 2509, in batch_norm
scraibe_large  |     return torch.batch_norm(
scraibe_large  | torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 312.00 MiB. GPU
^CGracefully stopping... (press Ctrl+C again to force)

I'm not sure whether this belongs in the ScrAIbe repo or this one. I think the solution is to reduce the batch size? I also tried other, smaller models and got the same error.
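
On the batch-size idea: the traceback shows the OOM is raised inside pyannote's speaker-embedding step (get_embeddings), so lowering pyannote's batch sizes is a plausible workaround. Below is a minimal sketch against pyannote.audio 3.x directly; the attribute names are version-dependent, the model id and token are placeholders, and whether ScrAIbe-WebUI exposes these knobs in its config is not confirmed here.

# Sketch: shrink pyannote's batch sizes to cut peak VRAM during diarization.
# Assumes pyannote.audio 3.x; the model id and token are placeholders.
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="YOUR_HF_TOKEN",
)
# Smaller batches trade throughput for a lower memory peak in get_embeddings().
pipeline.embedding_batch_size = 4
pipeline.segmentation_batch_size = 4
diarization = pipeline("audio.wav")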

Thu Aug 29 10:54:33 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1050        Off | 00000000:01:00.0  On |                  N/A |
| 35%   24C    P8              N/A /  75W |     63MiB /  2048MiB |     28%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      7024      G   /usr/lib/xorg/Xorg                           61MiB |
+---------------------------------------------------------------------------------------+
JSchmie commented 2 months ago

Hello @neohunter,

Given that you're working with a GTX 1050 (2 GiB of VRAM, per your nvidia-smi output), your options for Whisper models are quite limited. I recommend reviewing the official Whisper repository to see which models fit in your GPU's memory. Since Pyannote also needs some GPU memory, I suggest using either the base or tiny model. If you're using Faster Whisper, you might be able to fit the small model, but be aware of the recent issue #27. Also note that OpenAI's Whisper backend does not support batching.
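
To illustrate the model-size advice above, here is a rough sketch (assuming the openai-whisper package and a CUDA build of PyTorch) that checks free VRAM before picking a model. The 1.5 GiB threshold is an illustrative assumption, not an official requirement.

# Pick a Whisper model size based on free VRAM. Whisper's README lists
# roughly ~1 GB of VRAM for tiny/base and ~2 GB for small; leave headroom
# for pyannote's diarization models.
import torch
import whisper

free, total = torch.cuda.mem_get_info()  # bytes free/total on current device
free_gib = free / 1024**3

name = "base" if free_gib >= 1.5 else "tiny"  # threshold is an assumption
model = whisper.load_model(name, device="cuda")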

JSchmie commented 2 months ago

Does this solve your problem, @neohunter?