SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License

Bug - "No active speech found in audio results" #997

Closed asusdisciple closed 1 month ago

asusdisciple commented 2 months ago

If the VAD does not find any speech in the audio, it returns an empty list and prints the message "No active speech found in audio results". So far so good.

But next, faster-whisper runs into an error that I cannot catch outside of the package, because line 523 of transcribe.py uses torch.stack(), which expects a non-empty TensorList.

The expected behaviour is that faster-whisper simply returns an empty string if it does not find any speech in the audio.

The problematic code section is in the VAD model's merge_chunks function:

    if len(segments_list) == 0:
        print("No active speech found in audio")
        return []
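
A minimal sketch of the behaviour I would expect, written in plain Python so it runs without the package. The names merge_chunks_stub, transcribe_or_empty and transcribe_batch are made up for illustration and are not the actual faster-whisper API; the only part taken from the package is the empty-list check shown above.

```python
from typing import Callable, List, Sequence


def merge_chunks_stub(segments_list: Sequence[dict]) -> List[dict]:
    """Hypothetical stand-in for the VAD merge step; returns [] when no speech was found."""
    if len(segments_list) == 0:
        print("No active speech found in audio")
        return []
    return list(segments_list)


def transcribe_or_empty(
    segments_list: Sequence[dict],
    transcribe_batch: Callable[[List[dict]], str],
) -> str:
    """Return "" when there are no speech chunks instead of passing an empty list on.

    transcribe_batch is an assumed callback representing the step that would
    stack the chunk features and decode them.
    """
    chunks = merge_chunks_stub(segments_list)
    if not chunks:
        # Nothing to stack or decode, so short-circuit with an empty transcript.
        return ""
    return transcribe_batch(chunks)
```

With a guard like this, the caller gets an empty string for silent audio, and the stacking step is only reached when at least one chunk exists.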

MahmoudAshraf97 commented 2 months ago

It's fixed in #936