MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License

How to handle files with no voice? #27

Closed Oheed911 closed 1 year ago

Oheed911 commented 1 year ago

I have a dataset of audio files. If I upload a file that is, let's say, 10 seconds long and contains no voice or only background noise, it gives the error ValueError: All files present in manifest contains silence, aborting next steps

How should I handle this in the code?

MahmoudAshraf97 commented 1 year ago

Both Whisper and NeMo include a VAD (voice activity detection) step that detects whether the audio file contains speech, so if the audio doesn't contain speech it will be skipped. What behavior are you expecting?
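For readers who only need to handle the error in code, here is a minimal sketch that catches the ValueError quoted above and reports the file as skipped. `run_diarization` is a hypothetical stand-in for whatever function in your code calls the diarizer, and matching on the substring "contains silence" is an assumption based only on the error text in this thread.

```python
from typing import Callable, Optional

# Assumption: the silence error can be recognized by this substring,
# taken from the message quoted in this thread.
SILENCE_MARKER = "contains silence"


def diarize_or_none(
    run_diarization: Callable[[str], object], audio_path: str
) -> Optional[object]:
    """Run diarization; return None if the file was rejected as silent.

    `run_diarization` is a hypothetical callable supplied by the caller.
    """
    try:
        return run_diarization(audio_path)
    except ValueError as err:
        if SILENCE_MARKER in str(err):
            # No speech was found, so report "skipped" to the caller.
            return None
        # Any other ValueError is a real problem; do not hide it.
        raise
```

A caller can then check `if result is None:` and return a "no speech detected" response instead of failing.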

Oheed911 commented 1 year ago

Yeah, the file is skipped. If diarization is skipped because there is no speech, is there a programmatic check to know in advance that the audio will be skipped? For example, some of my audio files contain a 1-second segment of audio, but NeMo does not diarize it and skips it. I need to know at run time whether an audio file is being skipped by NeMo so I can return a response accordingly. Thanks.

MahmoudAshraf97 commented 1 year ago

You can use the MarbleNet VAD model from the NeMo library to find out what will be skipped.
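As a rough illustration of such a pre-check, the sketch below takes per-frame speech probabilities (the kind of output a frame-level VAD model produces) and decides whether any speech segment is long enough to keep. It does not call NeMo: how the probabilities are obtained is left out, and the names `speech_segments` and `will_be_skipped`, as well as the default threshold, frame shift, and minimum duration, are assumptions for illustration rather than NeMo's actual settings.

```python
from typing import List, Sequence, Tuple


def speech_segments(
    probs: Sequence[float],
    frame_shift: float = 0.02,   # seconds per frame (assumed value)
    threshold: float = 0.5,      # speech/non-speech cutoff (assumed value)
    min_duration: float = 0.25,  # shortest segment to keep, in seconds (assumed)
) -> List[Tuple[float, float]]:
    """Turn per-frame speech probabilities into (start, end) times in seconds."""
    runs = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                    # a speech run begins
        elif p < threshold and start is not None:
            runs.append((start, i))      # the run ended at frame i (exclusive)
            start = None
    if start is not None:
        runs.append((start, len(probs)))  # run extends to the end of the audio

    # Drop runs shorter than min_duration and convert frame indices to seconds.
    return [
        (s * frame_shift, e * frame_shift)
        for s, e in runs
        if (e - s) * frame_shift >= min_duration
    ]


def will_be_skipped(probs: Sequence[float], **kwargs) -> bool:
    """True if no speech segment survives the threshold and duration filter."""
    return not speech_segments(probs, **kwargs)
```

If `will_be_skipped(...)` returns True for a file, the caller can return a "no speech detected" response without running diarization. A short burst of speech can fall below the minimum duration, which may explain why a 1-second segment is dropped, depending on the settings in use.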