So if the VAD does not find any speech in the audio, it returns an empty list and prints
the message "No active speech found in audio". So far so good.
But then fast-whisper runs into an error which I cannot catch outside of the package, because on line 523 of transcribe.py you use torch.stack(), which expects a non-empty TensorList.
The expected behaviour would be that fast-whisper just returns an empty string if it does not find any speech in an audio file.
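For reference, here is a minimal standalone sketch of the failure mode (my own repro, not the library code): torch.stack() raises a RuntimeError when handed an empty list, which is exactly what happens when the VAD returns no segments.

```python
import torch

# Minimal repro: torch.stack() requires at least one tensor.
# Passing an empty list raises
# "RuntimeError: stack expects a non-empty TensorList".
empty_segments: list = []
torch.stack(empty_segments)  # raises RuntimeError
```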
The problematic code section is in the VAD model's merge_chunks function:

```python
if len(segments_list) == 0:
    print("No active speech found in audio")
    return []
```
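A possible fix (just a sketch of the idea, not the actual transcribe.py code; the helper name and its argument are my assumptions) would be to guard the torch.stack() call and short-circuit to an empty result when the VAD returned no segments:

```python
from typing import Optional

import torch

def stack_vad_segments(segment_tensors: list) -> Optional[torch.Tensor]:
    # Hypothetical guard around the torch.stack() call in transcribe.py:
    # bail out early on an empty VAD result instead of letting
    # torch.stack() raise "stack expects a non-empty TensorList".
    if len(segment_tensors) == 0:
        return None
    return torch.stack(segment_tensors)
```

The caller would then translate the None result into an empty transcription, matching the expected behaviour described above.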