abdeladim-s / subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️
https://abdeladim-s.github.io/subsai/
GNU General Public License v3.0

whisperX results in "unexpected keyword argument" error for 3 fields #70

Open Camology opened 9 months ago

Camology commented 9 months ago

Command I input to Windows PowerShell:

subsai .\file --model m-bain/whisperX --model-configs '{\"model_type\": \"large-v2\", \"device\": \"cuda\"}' --format srt

And the result is:

    TypeError: <lambda>() got an unexpected keyword argument 'repetition_penalty'

Reading the source code for whisperX, I see there is a repetition_penalty field that isn't present in your code:

https://github.com/m-bain/whisperX/blob/b1a98b78c9152ace9f9801593b5fa0c7d5d96b0f/whisperx/asr.py#L66

default_asr_options = {
    "beam_size": 5,
    "best_of": 5,
    "patience": 1,
    "length_penalty": 1,
    "repetition_penalty": 1,
    "no_repeat_ngram_size": 0,
    "temperatures": [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
    "compression_ratio_threshold": 2.4,
    "log_prob_threshold": -1.0,
    "no_speech_threshold": 0.6,
    "condition_on_previous_text": False,
    "prompt_reset_on_temperature": 0.5,
    "initial_prompt": None,
    "prefix": None,
    "suppress_blank": True,
    "suppress_tokens": [-1],
    "without_timestamps": True,
    "max_initial_timestamp": 0.0,
    "word_timestamps": False,
    "prepend_punctuations": "\"'“¿([{-",
    "append_punctuations": "\"'.。,,!!??::”)]}、",
    "suppress_numerals": False,
}

But you don't have all of these in your code as-is, so they may not all be required.
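For context, here is a minimal, self-contained sketch of how this kind of TypeError arises when a newer defaults dict is splatted into an older callable. The function name is made up for illustration; this is not subsai's or whisperX's actual code:

    # Hypothetical illustration of the failure mode, not code from subsai or whisperX.
    def build_options(beam_size=5, best_of=5, patience=1):  # older signature, no repetition_penalty
        return {"beam_size": beam_size, "best_of": best_of, "patience": patience}

    newer_defaults = {"beam_size": 5, "repetition_penalty": 1}  # key added in a newer release

    try:
        build_options(**newer_defaults)
    except TypeError as e:
        print(e)  # build_options() got an unexpected keyword argument 'repetition_penalty'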

Camology commented 9 months ago

Workaround for now:

Find your whisperX install via pip show whisperX.

In that folder, open asr.py and comment out the following 3 lines:

under default_asr_options:

default_asr_options = {
    "beam_size": 5,
    "best_of": 5,
    "patience": 1,
    "length_penalty": 1,
    # "repetition_penalty": 1,
    # "no_repeat_ngram_size": 0,
    "temperatures": [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
    "compression_ratio_threshold": 2.4,
    "log_prob_threshold": -1.0,
    "no_speech_threshold": 0.6,
    "condition_on_previous_text": False,
    # "prompt_reset_on_temperature": 0.5,
    "initial_prompt": None,
    "prefix": None,
    "suppress_blank": True,
    "suppress_tokens": [-1],
    "without_timestamps": True,
    "max_initial_timestamp": 0.0,
    "word_timestamps": False,
    "prepend_punctuations": "\"'“¿([{-",
    "append_punctuations": "\"'.。,,!!??::”)]}、",
    "suppress_numerals": False,
}
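As an alternative to editing the installed asr.py, a generic approach (just a sketch, not something subsai currently does) is to filter the options dict against the target callable's signature before splatting it:

    import inspect

    def filter_supported_kwargs(func, options):
        """Keep only the keys that func's signature actually accepts."""
        accepted = inspect.signature(func).parameters
        return {k: v for k, v in options.items() if k in accepted}

    def old_api(beam_size=5, best_of=5):  # stand-in for an older backend signature
        return {"beam_size": beam_size, "best_of": best_of}

    safe = filter_supported_kwargs(old_api, {"beam_size": 5, "repetition_penalty": 1})
    old_api(**safe)  # no TypeError; the unsupported key was dropped

This avoids patching site-packages, at the cost of silently ignoring options the installed backend cannot use.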
abdeladim-s commented 9 months ago

Thanks @Camology for reporting the issue. I think the issue is caused by the latest commits in the whisperX repo. I updated the requirements.txt file to use the commit hash of the version working on my machine.
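For reference, pinning a Git dependency to a specific commit in requirements.txt looks roughly like this; the <commit-hash> below is a placeholder, not the actual commit that was pinned:

    # requirements.txt (sketch)
    whisperx @ git+https://github.com/m-bain/whisperX.git@<commit-hash>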

ptbsare commented 8 months ago

Commit 119c6da4c2e39f3e09cbdc9660a7af96f7d7f692 made this stop working. The reason is that repetition_penalty, no_repeat_ngram_size, and prompt_reset_on_temperature were added in faster-whisper v0.8; see https://github.com/guillaumekln/faster-whisper/blob/ad388cd394d43c0c13a0dde4577dd611a980c679/faster_whisper/transcribe.py#L45. To make whisperX work, one should pin faster-whisper to version 0.5.1 (requirements.txt).
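For anyone who wants to apply that locally before an updated subsai release, the pin itself is a one-liner (version taken from the comment above; check it against subsai's requirements.txt):

    pip install faster-whisper==0.5.1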

abdeladim-s commented 8 months ago

Thanks @ptbsare for pointing that out. I think you are right; I will roll back faster-whisper to v0.5.1.

nexuslux commented 5 months ago

Did this get resolved? I still have the same error now:

"These arguments are repetition_penalty, no_repeat_ngram_size, and prompt_reset_on_temperature. Refer to the documentation or examples of the whisperx package to see how these arguments should be set."

abdeladim-s commented 5 months ago

@nexuslux, could you please try now? I've updated it to the latest release and it works on my end.

nexuslux commented 5 months ago

Still having an error - did a fresh install. Wonder why that is?

I'm using WhisperX with diarization. I have my Hugging Face token in there; no other settings were touched.

I tried WhisperX without the diarization and it worked perfectly.

[Screenshot of the error output: 2024-01-18 at 22:19:25]
abdeladim-s commented 5 months ago

@nexuslux, I think you didn't accept the user agreement for the pyannote models on HF. Take a look at the console and you should see the link for that.

nexuslux commented 5 months ago

Ok got it! Just in case anyone comes across this later:

Link 1: https://github.com/pyannote/pyannote-audio?tab=readme-ov-file
Link 2: https://huggingface.co/pyannote/speaker-diarization-3.1
Link 3: https://huggingface.co/pyannote/speaker-diarization
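A quick way to check that the token and agreements are in place is to load the pipeline directly (a minimal sketch using pyannote.audio's documented loading call; the token string is a placeholder):

    from pyannote.audio import Pipeline

    # Placeholder token; use your own HF token after accepting the model agreements above.
    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1",
        use_auth_token="hf_xxx",
    )
    # pyannote prints a message and returns None if the gated-model agreement
    # has not been accepted for this token.
    print("OK" if pipeline is not None else "agreement not accepted or token invalid")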

I guess the issue now is that things hang and don't complete...

    2024-01-19 10:31:31.156 Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.1.3. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint .cache/torch/whisperx-vad-segmentation.bin
    Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
    Model was trained with torch 1.10.0+cu102, yours is 2.1.2. Bad things might happen unless you revert torch to 1.x.

abdeladim-s commented 5 months ago

Do you still have an issue? If you provide the HF token and accept the pyannote agreement, the models should get downloaded without any problems, at least on my end.

nexuslux commented 5 months ago

Let me re-install subsai and see how it goes. Btw, really appreciative of your fast replies and just how comprehensive this repo is. Amazing work.

abdeladim-s commented 5 months ago

You are welcome @nexuslux. Feel free to let me know if you find any problems.

Letiliel commented 2 months ago

Getting the same issue: things hang after the Lightning and pyannote warnings shown above.