abdeladim-s / subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️
https://abdeladim-s.github.io/subsai/
GNU General Public License v3.0

Error running whisperX #110

Open Theking1313 opened 4 months ago

Theking1313 commented 4 months ago

I was trying to use "m-bain/whisperX" with the large-v3 model and speaker labels, but I get these errors:

Other models, such as faster-whisper or whisper.cpp, work without issue.

Whisper-timestamped seems to have no CUDA option, despite its GitHub page mentioning that it can run on GPU.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\storage\in_memory_cache_storage_wrapper.py", line 87, in get
    entry_bytes = self._read_from_mem_cache(key)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\storage\in_memory_cache_storage_wrapper.py", line 137, in _read_from_mem_cache
    raise CacheStorageKeyNotFoundError("Key not found in mem cache")
streamlit.runtime.caching.storage.cache_storage_protocol.CacheStorageKeyNotFoundError: Key not found in mem cache

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_data_api.py", line 588, in read_result
    pickled_entry = self.storage.get(key)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\storage\in_memory_cache_storage_wrapper.py", line 89, in get
    entry_bytes = self._persist_storage.get(key)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\storage\local_disk_cache_storage.py", line 133, in get
    raise CacheStorageKeyNotFoundError(
streamlit.runtime.caching.storage.cache_storage_protocol.CacheStorageKeyNotFoundError: Local disk cache storage is disabled (persist=None)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 293, in _handle_cache_miss
    cached_result = cache.read_result(value_key)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_data_api.py", line 590, in read_result
    raise CacheKeyNotFoundError(str(e)) from e
streamlit.runtime.caching.cache_errors.CacheKeyNotFoundError: Local disk cache storage is disabled (persist=None)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\Lib\site-packages\subsai\webui.py", line 555, in <module>
    run()
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\Lib\site-packages\subsai\webui.py", line 548, in run
    webui()
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\Lib\site-packages\subsai\webui.py", line 318, in webui
    subs = _transcribe(file_path, stt_model_name, model_config)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 194, in wrapper
    return cached_func(*args, **kwargs)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 223, in __call__
    return self._get_or_create_cached_value(args, kwargs)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 248, in _get_or_create_cached_value
    return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 302, in _handle_cache_miss
    computed_value = self._info.func(*func_args, **func_kwargs)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\Lib\site-packages\subsai\webui.py", line 189, in _transcribe
    model = subs_ai.create_model(model_name, model_config=model_config)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\subsai\main.py", line 96, in create_model
    return AVAILABLE_MODELS[model_name]['class'](model_config)
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\subsai\models\whisperX_model.py", line 123, in __init__
    self.model = whisperx.load_model(self.model_type,
  File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\site-packages\whisperx\asr.py", line 92, in load_model
    default_asr_options = faster_whisper.transcribe.TranscriptionOptions(**default_asr_options)
abdeladim-s commented 4 months ago

Try smaller models; if they work, then this is probably a WhisperX issue. Go to their repo and see if they support large-v3.

For Whisper-timestamped, you should see the CUDA option if you have a CUDA device. If the option is missing despite having one, check whether the PyTorch GPU build is installed correctly and can see your device by running:

import torch

# Should print True if the PyTorch CUDA build can see your GPU
print(torch.cuda.is_available())
Theking1313 commented 4 months ago

Turns out I was missing CUDA, so I managed to fix that.

Still working on the other issue.

reyaz006 commented 3 months ago

I'm getting this error with large-v2:

TypeError: TranscriptionOptions.__new__() missing 6 required positional arguments: 'repetition_penalty', 'no_repeat_ngram_size', 'prompt_reset_on_temperature', 'max_new_tokens', 'clip_timestamps', and 'hallucination_silence_threshold'
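This `TypeError` typically means a version mismatch: a newer faster-whisper release added required fields to `TranscriptionOptions`, and an older whisperX builds its options dict without them. A minimal sketch of the failure mechanism (the class below is a hypothetical stand-in, not the real faster-whisper code):

```python
from typing import NamedTuple

# Hypothetical stand-in for faster_whisper.transcribe.TranscriptionOptions:
# a newer release adds a required field that the older caller doesn't supply.
class TranscriptionOptionsV2(NamedTuple):
    beam_size: int
    repetition_penalty: float  # newly required field

# What an older whisperX effectively builds: no 'repetition_penalty' key.
old_options = {"beam_size": 5}

try:
    TranscriptionOptionsV2(**old_options)
except TypeError as e:
    # Raises TypeError for the missing required argument,
    # mirroring the error reported above.
    print(e)
```

Pinning faster-whisper to the version that the installed whisperX release was built against (or upgrading both together) is the usual fix for this class of error.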