kanjieater closed this issue 1 year ago
Unfortunately, it does not seem to happen every time I run the same file.
You can try the new commit to see if it prevents this error.
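For reference, one way to install directly from the latest commit (assuming you are installing from the stable-ts GitHub repository; adjust the URL if you use a fork) is:
pip install -U git+https://github.com/jianfch/stable-ts.git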
I'm experiencing the same issue. It happens with some specific fine-tuned models, but not all. I will also try the new commit and report back.
I used the latest commit and now get another error:
whisper_model.transcribe(filename, language="zh", regroup=True, demucs=True, vad=True)
/usr/local/lib/python3.9/dist-packages/stable_whisper/timing.py in _split_tokens(tokens, tokenizer)
    134             curr_tokens = []
    135
--> 136     assert len(text) == 0
    137
    138     return words, word_tokens
AssertionError:
why do we need to have this line?
assert len(text) == 0
The line serves to ensure that every word/character has been paired with the tokens that make it up. If there is still text left at the end of the pairing process, that means there was a mismatch in one of the earlier pairs.
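As a rough illustration of that pairing process (a simplified sketch of the idea, not the actual _split_tokens code in stable_whisper/timing.py; the tokenizer is assumed to expose a decode method):

def split_tokens_sketch(tokens, tokenizer):
    # Pair each character/word of the decoded text with the tokens that produce it.
    text = tokenizer.decode(tokens)
    words, word_tokens = [], []
    curr_tokens = []
    for token in tokens:
        curr_tokens.append(token)
        piece = tokenizer.decode(curr_tokens)
        # Commit a pair once the accumulated tokens decode to a prefix of the
        # remaining text, i.e. they form a complete character/word.
        if piece and text.startswith(piece):
            words.append(piece)
            word_tokens.append(curr_tokens)
            text = text[len(piece):]
            curr_tokens = []
    # Any leftover text means an earlier pairing went out of sync,
    # i.e. some decoded piece never lined up with the remaining text.
    assert len(text) == 0
    return words, word_tokens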
Can you transcribe the same audio with word_timestamps=False, then save the result as a JSON and share it?
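Something along these lines should produce that JSON (a sketch based on the call above; save_as_json is assumed to be available in your installed stable-ts version):

result = whisper_model.transcribe(
    filename,
    language="zh",
    regroup=True,
    demucs=True,
    vad=True,
    word_timestamps=False,  # skip word-level alignment, which is where the assertion fails
)
# Save the segment-level result for sharing.
result.save_as_json("result.json")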
Alternatively, you can try installing the previous version of Whisper to see if you can replicate this error:
pip install openai-whisper==20230308
If you can't replicate the error with this older version of Whisper, then it's likely an issue with the new tokenizer used in the newer version.
It seems that the latest 2.1 fixed most of the issues I was dealing with regarding the quality of the transcription timestamps. Unfortunately, after running a few tests, I ran into this error. I'll see if I can reproduce it consistently.