Closed ProducerMatt closed 1 year ago
@ProducerMatt, it seems the resulting segments do not have timings. Or was it a silent video, maybe? Could you please provide the media file you are using so I can debug the issue? The media files I am using for testing on my end work without any problems.
@abdeladim-s Thanks for the response.
The media isn't silent, and I got a great srt out of it when I used sentence splitting. You can find the media file here under Marble Hornets Season 1.mp4
This bug occurs regardless of model size, so for testing you can use tiny.en.
Thanks @ProducerMatt for providing the file. The problem is that the file is 90 minutes long, so the transcription will never finish on my decent machine :sweat_smile: I have tested some random small segments and they seem fine.
What I did for now is catch the error, so you will get a warning when the program reaches such a word. Please rebuild the Docker image with the new update and test it on your end.
You will now get the resulting srt file. Just let me know which fragment of the media file causes this error so we can investigate why!
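For anyone following along, the catch-and-warn approach could look roughly like the sketch below. This is not the actual code in whisperX_model.py; the function name `words_to_cues` and the word dictionary layout (`word`, `start`, `end` keys) are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("whisperx_sketch")  # hypothetical logger name


def words_to_cues(words):
    """Turn word dicts into (start, end, text) cues, skipping words without timings."""
    cues = []
    for word in words:
        try:
            cues.append((word["start"], word["end"], word["word"]))
        except KeyError as exc:
            # Warn and move on instead of crashing: first the offending word,
            # then the name of the missing key.
            logger.warning("Something wrong with %s", word)
            logger.warning("%s", exc)
    return cues
```

With this shape, a word such as `{'word': '20'}` is dropped from the output with a warning, and the rest of the subtitles are still written.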
You rock, thanks so much for the quick support. Here's what happened when I ran it:
[+] Processing file: /media_files/Marble Hornets Season 1.mp4
[22:04:14] WARNING Something wrong with {'word': '20'} whisperX_model.py:151
WARNING 'start' whisperX_model.py:152
WARNING Something wrong with {'word': '15'} whisperX_model.py:151
WARNING 'start' whisperX_model.py:152
WARNING Something wrong with {'word': '20'} whisperX_model.py:151
WARNING 'start' whisperX_model.py:152
WARNING Something wrong with {'word': '10.'} whisperX_model.py:151
WARNING 'start' whisperX_model.py:152
WARNING Something wrong with {'word': '12'} whisperX_model.py:151
WARNING 'start' whisperX_model.py:152
[+] Subtitles file saved to: /media_files/Marble Hornets Season 1.srt
Here's a .tar.gz of the .srt, which looks completely fine to me. https://clbin.com/Zmxv05
EDIT: sorry, the uploaded archive is truncated. If you can't open it, I'll have to figure out some other way to host it. I could also send it with croc if you have that.
It's OK, the srt file won't give many details about the cause of the problem. I should've printed the list to see where those "bad words" are located so we can extract those segments.
But from the warnings, it seems that whisperX is generating words without timings (probably the words that contain only numbers), which I think is a bug on their end!
Anyways, I think the bug is handled gracefully for now, let me know if you find any other issue or if you need help with anything else :)
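As a rough sketch of how the "bad words" could be located, the helper below reports each untimed word together with the nearest timings around it, which would give the fragment of the media file to extract. The name `locate_untimed_words` and the word dictionary layout are assumptions, not part of whisperX or this project.

```python
def locate_untimed_words(words):
    """Return (index, text, prev_end, next_start) for each word lacking a 'start' key."""
    report = []
    for i, word in enumerate(words):
        if "start" in word:
            continue
        # Nearest timed neighbours bracket the region where the word was spoken.
        prev_end = next((w["end"] for w in reversed(words[:i]) if "end" in w), None)
        next_start = next((w["start"] for w in words[i + 1:] if "start" in w), None)
        report.append((i, word.get("word"), prev_end, next_start))
    return report
```

A `None` on either side means there is no timed neighbour in that direction, for example when the untimed word is the first one in the list.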
Thanks for your help! This is probably the issue: https://github.com/m-bain/whisperX/issues/349
Yes, you are right, it is the same issue. Do you think it is worth using the proposed solutions to fix the issue, or should we just leave it as it is until the WhisperX maintainers fix it?
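For reference, one possible local workaround (not necessarily what the linked issue proposes) is to borrow timings for untimed words from their neighbours. The sketch below assumes word dictionaries with `word`, `start`, and `end` keys; the function name `fill_missing_timings` is hypothetical.

```python
def fill_missing_timings(words):
    """Return a copy of words where untimed entries borrow timings from neighbours."""
    filled = [dict(w) for w in words]  # copy so the input list is not mutated
    for i, word in enumerate(filled):
        if "start" in word and "end" in word:
            continue
        prev_end = next((w["end"] for w in reversed(filled[:i]) if "end" in w), None)
        next_start = next((w["start"] for w in filled[i + 1:] if "start" in w), None)
        if prev_end is None and next_start is None:
            continue  # nothing to anchor on, leave the word untimed
        # Stretch the word across the gap between its timed neighbours.
        word.setdefault("start", prev_end if prev_end is not None else next_start)
        word.setdefault("end", next_start if next_start is not None else prev_end)
    return filled
```

Note that with several untimed words in a row, the first one takes the whole gap and the following ones become zero-length cues, so this is only a stopgap until the timings are fixed upstream.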
@abdeladim-s Sorry for not responding.
Implementing the fix would be nice, but it's maintainer's choice since they may put one in any day. 🙂
Also, I've tried the feature, and I'm not clamoring to use it right now, haha. It's literally word-per-subtitle, meaning you can barely follow it. If it were like YouTube, where the subtitle fills out as it is spoken, or if it highlighted each word as it was spoken, I would have found use in it. I think another backend has that.
No problem @ProducerMatt,
Yes you are right :)
In that case I will leave it as is and wait for the maintainers to fix it in their next update.
Running the latest version from the docker image.
It ran for the expected length of time, so it probably finished the encoding and died at the end.