m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

Bug with load_model - need to adapt to faster-whisper 0.8.0 #457

Closed dwidoo closed 1 year ago

dwidoo commented 1 year ago

https://github.com/guillaumekln/faster-whisper/issues/455#issuecomment-1705249830

``
/usr/local/lib/python3.10/dist-packages/whisperx/asr.py in load_model(whisper_arch, device, device_index, compute_type, asr_options, language, vad_options, model, task, download_root)
     87 del default_asr_options["suppress_numerals"]
     88
---> 89 default_asr_options = faster_whisper.transcribe.TranscriptionOptions(**default_asr_options)
     90
     91 default_vad_options = {

TypeError: TranscriptionOptions.__new__() missing 3 required positional arguments: 'repetition_penalty', 'no_repeat_ngram_size', and 'prompt_reset_on_temperature'
``
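The message is what Python raises when a `NamedTuple`-style class is constructed without values for fields that have no defaults. A minimal sketch of that failure mode, assuming a stand-in class (`Options` below is hypothetical and much smaller than faster-whisper's real `TranscriptionOptions`):

```python
from typing import NamedTuple


class Options(NamedTuple):
    # Hypothetical stand-in: three fields, none of which has a default value.
    beam_size: int
    repetition_penalty: float
    no_repeat_ngram_size: int


# Defaults written before the two newer fields existed.
old_defaults = {"beam_size": 5}

try:
    Options(**old_defaults)
except TypeError as exc:
    # Same kind of error as in the traceback above: required fields are missing.
    error_message = str(exc)
    print(error_message)
```

Any caller that builds the tuple from a fixed dict of defaults breaks in this way as soon as the library adds a required field.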

Bonk1971 commented 1 year ago

I am getting this too.

remic33 commented 1 year ago

Same here. Our software is down because of this.

Bonk1971 commented 1 year ago

Looks like faster-whisper changed the default ASR settings: `TranscriptionOptions` now requires three more fields. Adding these to your `asr_options` when loading the model will work around the error until the repo is fixed: "repetition_penalty": 1, "no_repeat_ngram_size": 0, "prompt_reset_on_temperature": 0.5
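The workaround above could look roughly like this. It is a sketch, not the project's fix: `EXTRA_ASR_OPTIONS` and `with_workaround` are hypothetical names, and the model name and device in the commented-out call are assumptions. Only the `asr_options` parameter of `load_model` is taken from the traceback.

```python
# The three keys named in the TypeError, with the values suggested above.
EXTRA_ASR_OPTIONS = {
    "repetition_penalty": 1,
    "no_repeat_ngram_size": 0,
    "prompt_reset_on_temperature": 0.5,
}


def with_workaround(user_options=None):
    """Return ASR options with the three extra keys added (user values win)."""
    return {**EXTRA_ASR_OPTIONS, **(user_options or {})}


asr_options = with_workaround({"beam_size": 5})

# Assumed usage; needs whisperx installed, so it is left commented out:
# import whisperx
# model = whisperx.load_model("large-v2", "cuda", asr_options=asr_options)
```

Merging with the user's dict last means any value you set explicitly is kept, and the extra keys only fill the gaps.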

remic33 commented 1 year ago

Should work with the following `default_asr_options` in whisperx/asr.py:

    default_asr_options = {
        "beam_size": 5,
        "best_of": 5,
        "patience": 1,
        "length_penalty": 1,
        "repetition_penalty": 1,
        "no_repeat_ngram_size": 0,
        "temperatures": [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
        "compression_ratio_threshold": 2.4,
        "log_prob_threshold": -1.0,
        "no_speech_threshold": 0.6,
        "condition_on_previous_text": False,
        "prompt_reset_on_temperature": 0.5,
        "initial_prompt": None,
        "prefix": None,
        "suppress_blank": True,
        "suppress_tokens": [-1],
        "without_timestamps": True,
        "max_initial_timestamp": 0.0,
        "word_timestamps": False,
        "prepend_punctuations": "\"'“¿([{-",
        "append_punctuations": "\"'.。,,!!??::”)]}、",
        "suppress_numerals": False,
    }
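A hardcoded dict like the one above breaks again whenever the library adds another required field. One way to spot the gap before the constructor fails is to compare the dict against the tuple's `_fields`. The sketch below is an illustration, not whisperX code: `Options` and `missing_fields` are hypothetical, and it assumes the options class is a `NamedTuple`.

```python
from typing import NamedTuple


class Options(NamedTuple):
    # Hypothetical stand-in for the library's options tuple.
    beam_size: int
    repetition_penalty: float
    prompt_reset_on_temperature: float
    word_timestamps: bool = False  # has a default, so it is never "missing"


def missing_fields(options_cls, defaults):
    """List required NamedTuple fields that `defaults` does not provide."""
    return [
        name
        for name in options_cls._fields
        if name not in defaults and name not in options_cls._field_defaults
    ]


defaults = {"beam_size": 5}
missing = missing_fields(Options, defaults)
print(missing)
```

Running such a check at import time would turn the opaque `TypeError` into an explicit list of the keys that still need values.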