Open snoesberger opened 3 months ago
This is not really an Opencast problem; it comes from whisper-ctranslate2. In any case, a subtitle file should be generated even when the audio contains no recognisable speech. This issue has already been addressed in the faster-whisper repository (faster-whisper is a dependency of whisper-ctranslate2): https://github.com/SYSTRAN/faster-whisper/pull/895. So far, no released version of faster-whisper includes this PR. However, as the change is quite simple, it can easily be applied manually.
Describe the bug
When I upload a video whose audio track contains no spoken words and run a workflow that generates subtitles with whisper-ctranslate2, the speech-to-text workflow step fails with the error message "Whisper produced no output".
To Reproduce
Steps to reproduce the behavior:
--vad_filter True
Expected behavior
From the user's point of view, this case should not generate an error, because Whisper worked as expected. If there are no spoken words in the audio, no subtitles should be generated. In this case, the workflow should not fail, but a warning should be reported.
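To illustrate the behaviour asked for here, below is a minimal Python sketch of a subtitle writer that treats "no speech" as a normal result: it logs a warning and still writes a valid, cue-less WebVTT file instead of failing. This is not the change from the faster-whisper PR and not Opencast or whisper-ctranslate2 code. The names `write_vtt` and `format_timestamp`, and the `(start, end, text)` tuple shape for segments, are assumptions made for this example only.

```python
import logging
from pathlib import Path

logger = logging.getLogger("subtitles")


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def write_vtt(segments, path) -> int:
    """Write (start, end, text) tuples to a WebVTT file; return the cue count.

    Hypothetical helper: an empty segment list is not treated as an error.
    A warning is logged and a file containing only the header is written.
    """
    segments = list(segments)
    if not segments:
        logger.warning("No speech detected; writing empty subtitle file %s", path)

    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # blank line terminates the cue

    Path(path).write_text("\n".join(lines), encoding="utf-8")
    return len(segments)
```

With this shape, a caller could check the returned cue count and surface a warning in the workflow rather than aborting when it is zero.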
Server environment: