abdeladim-s / subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️
https://abdeladim-s.github.io/subsai/
GNU General Public License v3.0

RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR #131

Closed hirowa closed 2 months ago

hirowa commented 2 months ago

Hi, I'm getting the following error when trying to run WhisperX Large v3 on the WebUI and a 4090. Trying to run it with speaker labels enabled. I've accepted all the HF permissions. Everything runs fine, until I enable that option.

Installed in Win11 Docker.

Hope someone can help me.

2024-04-23 18:29:54 2024-04-24 00:29:54.039 Created a temporary directory at /tmp/tmp2p1vvkm1
2024-04-23 18:29:54 2024-04-24 00:29:54.039 Writing /tmp/tmp2p1vvkm1/_remote_module_non_scriptable.py
2024-04-23 18:29:54 Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
2024-04-23 18:29:54 Model was trained with torch 1.10.0+cu102, yours is 2.0.1. Bad things might happen unless you revert torch to 1.x.
2024-04-23 18:29:55 Detected language: es (0.94) in first 30s of audio...
2024-04-23 18:29:58 2024-04-24 00:29:58.115 Uncaught app exception
2024-04-23 18:29:58 Traceback (most recent call last):
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/storage/in_memory_cache_storage_wrapper.py", line 87, in get
2024-04-23 18:29:58     entry_bytes = self._read_from_mem_cache(key)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/storage/in_memory_cache_storage_wrapper.py", line 137, in _read_from_mem_cache
2024-04-23 18:29:58     raise CacheStorageKeyNotFoundError("Key not found in mem cache")
2024-04-23 18:29:58 streamlit.runtime.caching.storage.cache_storage_protocol.CacheStorageKeyNotFoundError: Key not found in mem cache
2024-04-23 18:29:58 
2024-04-23 18:29:58 During handling of the above exception, another exception occurred:
2024-04-23 18:29:58 
2024-04-23 18:29:58 Traceback (most recent call last):
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/cache_data_api.py", line 588, in read_result
2024-04-23 18:29:58     pickled_entry = self.storage.get(key)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/storage/in_memory_cache_storage_wrapper.py", line 89, in get
2024-04-23 18:29:58     entry_bytes = self._persist_storage.get(key)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/storage/local_disk_cache_storage.py", line 133, in get
2024-04-23 18:29:58     raise CacheStorageKeyNotFoundError(
2024-04-23 18:29:58 streamlit.runtime.caching.storage.cache_storage_protocol.CacheStorageKeyNotFoundError: Local disk cache storage is disabled (persist=None)
2024-04-23 18:29:58 
2024-04-23 18:29:58 The above exception was the direct cause of the following exception:
2024-04-23 18:29:58 
2024-04-23 18:29:58 Traceback (most recent call last):
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 245, in _get_or_create_cached_value
2024-04-23 18:29:58     cached_result = cache.read_result(value_key)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/cache_data_api.py", line 590, in read_result
2024-04-23 18:29:58     raise CacheKeyNotFoundError(str(e)) from e
2024-04-23 18:29:58 streamlit.runtime.caching.cache_errors.CacheKeyNotFoundError: Local disk cache storage is disabled (persist=None)
2024-04-23 18:29:58 
2024-04-23 18:29:58 During handling of the above exception, another exception occurred:
2024-04-23 18:29:58 
2024-04-23 18:29:58 Traceback (most recent call last):
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/storage/in_memory_cache_storage_wrapper.py", line 87, in get
2024-04-23 18:29:58     entry_bytes = self._read_from_mem_cache(key)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/storage/in_memory_cache_storage_wrapper.py", line 137, in _read_from_mem_cache
2024-04-23 18:29:58     raise CacheStorageKeyNotFoundError("Key not found in mem cache")
2024-04-23 18:29:58 streamlit.runtime.caching.storage.cache_storage_protocol.CacheStorageKeyNotFoundError: Key not found in mem cache
2024-04-23 18:29:58 
2024-04-23 18:29:58 During handling of the above exception, another exception occurred:
2024-04-23 18:29:58 
2024-04-23 18:29:58 Traceback (most recent call last):
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/cache_data_api.py", line 588, in read_result
2024-04-23 18:29:58     pickled_entry = self.storage.get(key)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/storage/in_memory_cache_storage_wrapper.py", line 89, in get
2024-04-23 18:29:58     entry_bytes = self._persist_storage.get(key)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/storage/local_disk_cache_storage.py", line 133, in get
2024-04-23 18:29:58     raise CacheStorageKeyNotFoundError(
2024-04-23 18:29:58 streamlit.runtime.caching.storage.cache_storage_protocol.CacheStorageKeyNotFoundError: Local disk cache storage is disabled (persist=None)
2024-04-23 18:29:58 
2024-04-23 18:29:58 The above exception was the direct cause of the following exception:
2024-04-23 18:29:58 
2024-04-23 18:29:58 Traceback (most recent call last):
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 293, in _handle_cache_miss
2024-04-23 18:29:58     cached_result = cache.read_result(value_key)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/cache_data_api.py", line 590, in read_result
2024-04-23 18:29:58     raise CacheKeyNotFoundError(str(e)) from e
2024-04-23 18:29:58 streamlit.runtime.caching.cache_errors.CacheKeyNotFoundError: Local disk cache storage is disabled (persist=None)
2024-04-23 18:29:58 
2024-04-23 18:29:58 During handling of the above exception, another exception occurred:
2024-04-23 18:29:58 
2024-04-23 18:29:58 Traceback (most recent call last):
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
2024-04-23 18:29:58     exec(code, module.__dict__)
2024-04-23 18:29:58   File "/subsai/src/subsai/webui.py", line 573, in <module>
2024-04-23 18:29:58     run()
2024-04-23 18:29:58   File "/subsai/src/subsai/webui.py", line 566, in run
2024-04-23 18:29:58     webui()
2024-04-23 18:29:58   File "/subsai/src/subsai/webui.py", line 330, in webui
2024-04-23 18:29:58     subs = _transcribe(file_path, stt_model_name, model_config)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 194, in wrapper
2024-04-23 18:29:58     return cached_func(*args, **kwargs)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 223, in __call__
2024-04-23 18:29:58     return self._get_or_create_cached_value(args, kwargs)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 248, in _get_or_create_cached_value
2024-04-23 18:29:58     return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 302, in _handle_cache_miss
2024-04-23 18:29:58     computed_value = self._info.func(*func_args, **func_kwargs)
2024-04-23 18:29:58   File "/subsai/src/subsai/webui.py", line 191, in _transcribe
2024-04-23 18:29:58     subs = subs_ai.transcribe(media_file=file_path, model=model)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/subsai/main.py", line 115, in transcribe
2024-04-23 18:29:58     return stt_model.transcribe(media_file)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/subsai/models/whisperX_model.py", line 139, in transcribe
2024-04-23 18:29:58     diarize_segments = diarize_model(audio, min_speakers=self.min_speakers, max_speakers=self.max_speakers)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/whisperx/diarize.py", line 28, in __call__
2024-04-23 18:29:58     segments = self.model(audio_data, num_speakers = num_speakers, min_speakers=min_speakers, max_speakers=max_speakers)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/core/pipeline.py", line 325, in __call__
2024-04-23 18:29:58     return self.apply(file, **kwargs)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/pipelines/speaker_diarization.py", line 514, in apply
2024-04-23 18:29:58     embeddings = self.get_embeddings(
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/pipelines/speaker_diarization.py", line 349, in get_embeddings
2024-04-23 18:29:58     embedding_batch: np.ndarray = self._embedding(
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/pipelines/speaker_verification.py", line 706, in __call__
2024-04-23 18:29:58     embeddings = self.model_(
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
2024-04-23 18:29:58     return forward_call(*args, **kwargs)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/models/embedding/wespeaker/__init__.py", line 111, in forward
2024-04-23 18:29:58     fbank = self.compute_fbank(waveforms)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/pyannote/audio/models/embedding/wespeaker/__init__.py", line 94, in compute_fbank
2024-04-23 18:29:58     features = torch.vmap(self._fbank)(waveforms.to(fft_device)).to(device)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 434, in wrapped
2024-04-23 18:29:58     return _flat_vmap(
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 39, in fn
2024-04-23 18:29:58     return f(*args, **kwargs)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 619, in _flat_vmap
2024-04-23 18:29:58     batched_outputs = func(*batched_inputs, **kwargs)
2024-04-23 18:29:58   File "/opt/conda/lib/python3.10/site-packages/torchaudio/compliance/kaldi.py", line 616, in fbank
2024-04-23 18:29:58     spectrum = torch.fft.rfft(strided_input).abs()
2024-04-23 18:29:58 RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR


abdeladim-s commented 2 months ago

Hi @hirowa,

WhisperX is known to have many issues. But it seems that this one is related to the 4090 GPU and CUDA. Did you check this: bug-ubuntu-on-wsl2-rtx4090-related-cufft-runtime-error? Can you please try the WhisperX package separately and see if it works?
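One way to narrow this down without WhisperX or Subs AI in the loop is to run the call at the bottom of the traceback, `torch.fft.rfft`, on the GPU by itself. The following is only a minimal sketch: the function name `check_gpu_rfft` and the tensor sizes are made up for illustration, and it assumes it is run inside the same container or environment that produced the error.

```python
"""Minimal check of the call that fails in the traceback: torch.fft.rfft on the GPU."""

try:
    import torch
except ImportError:  # report a missing install instead of crashing
    torch = None


def check_gpu_rfft(n_samples=512):
    """Return a short status string saying whether rfft runs on the GPU.

    `check_gpu_rfft` is a hypothetical helper, not part of subsai or WhisperX.
    """
    if torch is None:
        return "skipped: torch is not installed"
    if not torch.cuda.is_available():
        return "skipped: no CUDA device visible to torch"
    try:
        x = torch.randn(4, n_samples, device="cuda")
        spectrum = torch.fft.rfft(x).abs()  # same call as torchaudio's kaldi.fbank
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
    except RuntimeError as err:
        return f"failed: {err}"
    # rfft of n real samples yields n // 2 + 1 frequency bins
    assert spectrum.shape == (4, n_samples // 2 + 1)
    return f"ok: torch {torch.__version__}, CUDA {torch.version.cuda}"


if __name__ == "__main__":
    print(check_gpu_rfft())
```

If this reports `failed: cuFFT error: CUFFT_INTERNAL_ERROR`, the problem most likely sits in the torch/CUDA build rather than in the diarization code. If it reports `ok`, the cause is probably elsewhere.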

hirowa commented 2 months ago

I've tried multiple fixes to no avail :( Neither WhisperX nor Subsai runs if I install it via CLI. Subsai works on Docker (without diarization), but I'm not that code savvy, so I don't know what's happening.

abdeladim-s commented 2 months ago

Yes, in your case it is better to install the project manually so you can try different settings. But since Docker works, and based on the article I sent you above, I can give some suggestions if you want to try them?

hirowa commented 2 months ago

Suggestions are always welcome! Thank you :)

abdeladim-s commented 2 months ago

OK. I think the first thing is to try a different version of CUDA. To do this:

  1. Clone the repo.
  2. Open the Dockerfile and change

      FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime

     to another version from Docker Hub, maybe try 11.8:

      FROM pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime

  3. Rebuild the Docker image as you did before.

Then see whether that solves the issue.
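The base-image change above can also be scripted, which helps when trying several tags in a row. This is a rough sketch using only the standard library: `swap_base_image` and `cuda_version` are hypothetical helpers, and the sample Dockerfile text is invented for the example (only the two image tags come from this thread).

```python
import re

OLD_IMAGE = "pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime"
NEW_IMAGE = "pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime"

# Matches the first FROM line, skipping optional flags such as --platform=...
_FROM_LINE = re.compile(r"^(\s*FROM\s+(?:--\S+\s+)*)(\S+)", re.IGNORECASE | re.MULTILINE)


def swap_base_image(dockerfile_text, new_image=NEW_IMAGE):
    """Replace the image in the first FROM line; everything else is left untouched."""
    return _FROM_LINE.sub(lambda m: m.group(1) + new_image, dockerfile_text, count=1)


def cuda_version(image):
    """Extract (major, minor) from a tag like '...-cuda11.8-...', or None if absent."""
    match = re.search(r"cuda(\d+)\.(\d+)", image)
    return (int(match.group(1)), int(match.group(2))) if match else None


# Invented sample content; the real Dockerfile in the repo has more lines.
sample = f"FROM {OLD_IMAGE}\nWORKDIR /subsai\n"
updated = swap_base_image(sample)
print(updated.splitlines()[0])
print("CUDA version in new tag:", cuda_version(NEW_IMAGE))
```

To apply it to the real file, read the Dockerfile with `pathlib.Path.read_text()`, pass the text through `swap_base_image`, write it back with `write_text()`, and then rebuild the image the same way as before.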

hirowa commented 2 months ago

Hey! I applied this and it worked flawlessly. Thank you very much! Amazing work :)