MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License

Diarization process is abruptly cut short #96

Closed paul-arg closed 9 months ago

paul-arg commented 9 months ago

When I try to diarize a short 30-second conversation with the command

python3 diarize.py -a /path/to/my/file/interview_short.mp3 --whisper-model medium --device cpu

I get this result:

image

It looks like the output is cut short after "warnings.warn(". I am running the script on Ubuntu 22.04.3 under WSL2.

Do you have any ideas?

Thank you for your help.

paul-arg commented 9 months ago

My computer has 16 GB of RAM, and WSL2 is allotted 8 GB of it.

MahmoudAshraf97 commented 9 months ago

I don't see any error. Please check whether the outputs exist and whether a folder named temp_outputs exists. If the outputs exist and the folder doesn't, then the program finished correctly.
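
The check described above can be sketched in Python. This is only a sketch and not part of the repository: the helper name `diarization_finished` is hypothetical, and it assumes the outputs are `.txt` and `.srt` files sharing the audio file's base name and that temp_outputs lives in the directory the script was run from.

```python
from pathlib import Path


def diarization_finished(audio_path, work_dir=".", output_suffixes=(".txt", ".srt")):
    """Return True if outputs exist next to the audio and temp_outputs is gone.

    Hypothetical helper: the output suffixes and the location of the
    temp_outputs folder are assumptions, not confirmed by the thread.
    """
    audio = Path(audio_path)
    temp_dir = Path(work_dir) / "temp_outputs"
    # Every expected output file must exist alongside the audio file.
    outputs_exist = all(audio.with_suffix(s).is_file() for s in output_suffixes)
    # A leftover temp_outputs folder suggests the run did not finish.
    return outputs_exist and not temp_dir.exists()
```

For example, `diarization_finished("/path/to/my/file/interview_short.mp3")` would return False if either output is missing or if temp_outputs is still present in the current directory.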

paul-arg commented 9 months ago

Thank you for your answer

I tried again with the same command and now I have

KeyError: 'text'

image

MahmoudAshraf97 commented 9 months ago

Pull the latest version of the code and reinstall the requirements, as there were some breaking changes.

paul-arg commented 9 months ago

Thanks, I just did it and this is my result

image

There is no temp_outputs folder, but I cannot find the output.

MahmoudAshraf97 commented 9 months ago

The results should be in the same directory as the audio.
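
One way to locate the results is to list the files that sit next to the audio file and share its base name. This is a minimal sketch: the helper name `find_outputs` is hypothetical, and it assumes the outputs are named after the audio file, which the thread does not state explicitly.

```python
from pathlib import Path


def find_outputs(audio_path):
    """List files in the audio's directory that share its base name.

    Hypothetical helper: assumes outputs are named after the audio file.
    The audio file itself is excluded from the result.
    """
    audio = Path(audio_path)
    return sorted(
        p
        for p in audio.parent.iterdir()
        if p.is_file() and p.stem == audio.stem and p.name != audio.name
    )


if __name__ == "__main__":
    # Path taken from the command earlier in the thread.
    for path in find_outputs("/path/to/my/file/interview_short.mp3"):
        print(path)
```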

paul-arg commented 9 months ago

It works! I am closing this issue.