Closed rairavi closed 1 year ago
Thanks for reporting. I would need a way to reproduce this in order to investigate... (like having the audio file, and the exact set of options that are used, also knowing whether it runs on GPU or CPU device)
Can you at least give the whisper version you use?
whisper_timestamped --versions
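For anyone filing a similar report, the details asked for above (versions, platform, CPU vs GPU) can be gathered in one go. This is only a rough sketch: the distribution names `whisper-timestamped` and `openai-whisper` are assumed to match what `pip` installed, and the helper names are made up for illustration.

```python
import platform
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def detect_device():
    """Report "cuda" or "cpu"; "unknown" when torch cannot be imported."""
    try:
        import torch
    except Exception:  # torch missing or installation broken
        return "unknown (torch not importable)"
    return "cuda" if torch.cuda.is_available() else "cpu"


def collect_report(packages=("whisper-timestamped", "openai-whisper", "torch")):
    """Collect the environment details that are useful in a bug report."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "device": detect_device(),
        "packages": {name: package_version(name) for name in packages},
    }


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

Pasting the printed lines together with the exact transcription options and a link to the audio should give enough to attempt a reproduction.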
whisper_timestamped --versions
1.12.17 -- Whisper 20230314 in /Users/python/venv-3.10/lib/python3.10/site-packages/whisper
CPU, MacBook M2, on large-ve; only the timestamp=true option is set, the rest is default.
https://www.youtube.com/watch?v=rn64Vf6GEoo
(Sent by email in reply to Jérôme Louradour's comment of 02-May-2023, 10:13 PM.)
Thank you @rairavi
I've just disabled a dangerous heuristic that was causing this issue, and that was also possibly removing relevant words from the transcriptions. (It consisted of removing words with empty duration, on the assumption that they came from Whisper's hallucinations.)
So in a sense that issue is solved.
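To make the risk concrete, here is a minimal sketch of what such a heuristic might look like. It is a hypothetical re-creation based only on the description above, not the code that was in whisper-timestamped; the word dictionaries and the `min_duration` parameter are assumptions.

```python
def remove_trailing_null_duration_words(words, min_duration=0.0):
    """Drop words at the end of a segment whose duration is <= min_duration.

    Hypothetical re-creation of the heuristic, not the library's code.
    Returns (kept_words, removed_words).
    """
    kept = list(words)
    removed = []
    # Walk backwards from the end while the last word has no duration.
    while kept and kept[-1]["end"] - kept[-1]["start"] <= min_duration:
        removed.append(kept.pop())
    removed.reverse()
    return kept, removed


segment = [
    {"text": "thanks", "start": 1.00, "end": 1.40},
    {"text": "for", "start": 1.40, "end": 1.55},
    {"text": "watching", "start": 1.55, "end": 1.55},  # zero-length alignment
]

kept, removed = remove_trailing_null_duration_words(segment)
print([w["text"] for w in kept])     # ['thanks', 'for']
print([w["text"] for w in removed])  # ['watching']
```

In this toy segment "watching" was really spoken but got a zero-length alignment, so the filter silently drops a legitimate word. That is the failure mode that made the heuristic dangerous.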
 97%|█████████▋| 157296/162500 [40:30<01:17, 67.37frames/s]
 99%|█████████▊| 160296/162500 [40:55<00:32, 67.39frames/s]
 99%|█████████▊| 160296/162500 [41:10<00:32, 67.39frames/s]
100%|██████████| 162500/162500 [41:40<00:00, 61.43frames/s]
100%|██████████| 162500/162500 [41:40<00:00, 64.98frames/s]
Got inconsistent length for segment 48 (49 != 19). Some words have been ignored.
Traceback (most recent call last):
  File "/data/p/code.py", line 84, in
    result = transcribe(video_converted,language)
  File "/data/p/codeTranscript.py", line 15, in transcribe
    return transcribe_timestamped(audio,language)
  File "/data/p/codeTranscript.py", line 33, in transcribe_timestamped
    result = whisper_timestamped.transcribe(model, audio, language, fp16=False, verbose=False)
  File "/data/p/venv-3.10/lib/python3.10/site-packages/whisper_timestamped/transcribe.py", line 264, in transcribe_timestamped
    transcription, words = remove_last_null_duration_words(transcription, words, recompute_text=True)
  File "/data/p/venv-3.10/lib/python3.10/site-packages/whisper_timestamped/transcribe.py", line 1889, in remove_last_null_duration_words
    raise RuntimeError(f"\"{text}\" not ending with \"{full_word}\"")
RuntimeError: " पर्टिएजार के लिए पर्टिएजार के लिए पर्टिएजार के लिए पर्टिएजार के लिए पर्टिएजार के लिए पर्टिएजार के लिए पर्टिएजार के लिए पर्टिएजार के लिए पर्टिएजार के लिए पर्टिएजार के लिए पर्टिएजार के लिए �" not ending with " लिए"
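One plausible reading of this message is that the segment text ends in U+FFFD (the "�" replacement character), which is what a multi-byte character cut off mid-sequence decodes to, so a strict "text ends with word" check can no longer match. The sketch below reproduces that situation in isolation. The function names are invented for illustration, and the tolerant comparison is only one possible workaround, not the fix applied in the library.

```python
REPLACEMENT_CHAR = "\ufffd"  # emitted by Python for undecodable bytes

# A multi-byte character cut short decodes to U+FFFD with errors="replace".
truncated_bytes = " लिए क".encode("utf-8")[:-1]
segment_text = truncated_bytes.decode("utf-8", errors="replace")


def ends_with_word_strict(text, word):
    """Strict comparison, similar in spirit to the check that raised."""
    return text.endswith(word)


def ends_with_word_tolerant(text, word):
    """Ignore trailing spaces and replacement characters before comparing."""
    return text.rstrip(REPLACEMENT_CHAR + " ").endswith(word.rstrip())


print(segment_text.endswith(REPLACEMENT_CHAR))          # True
print(ends_with_word_strict(segment_text, " लिए"))      # False
print(ends_with_word_tolerant(segment_text, " लिए"))    # True
```

If upgrading to a version with the heuristic disabled is not possible, catching the `RuntimeError` around the transcription call and logging the offending file is a blunt but safe stopgap.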