sandrohanea / whisper.net

Whisper.net. Speech to text made simple using Whisper Models
MIT License

WithNoSpeechThreshold doesn't seem to do anything #96

Open AWAS666 opened 12 months ago

AWAS666 commented 12 months ago

Either I don't understand what this is supposed to do, or it doesn't work. I tried a variety of values from 0.1f to 10f and still had hallucinations in my output, such as "you" repeated over and over. This seems to occur whenever an empty audio stream is passed to Whisper, likely because such input was not included in the model's training data.

sandrohanea commented 12 months ago

Hello @AWAS666, indeed, the no-speech threshold is not yet supported in the underlying library:

https://github.com/ggerganov/whisper.cpp/blob/7b374c9ac9b9861bb737eec060e4dfa29d229259/whisper.h#L406

Once it is implemented there, whisper.net should support it automatically. As a workaround until then, I recommend using WithProbabilities: https://github.com/sandrohanea/whisper.net/blob/2c13353b13e0a6d8ef38e5aa3cc77c7da77661eb/Whisper.net/WhisperProcessorBuilder.cs#L494

Then check the confidence level of each segment via the Probability property: https://github.com/sandrohanea/whisper.net/blob/2c13353b13e0a6d8ef38e5aa3cc77c7da77661eb/Whisper.net/WhisperProcessorEvents.cs#L25C9-L25C20

Please note that these will probably be renamed to WithConfidence and ConfidenceLevel in upcoming versions.
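
For illustration, the filtering step of that workaround might look roughly like the sketch below. It is not taken from the whisper.net codebase: the `Segment` record is a hypothetical stand-in that models only the text and the `Probability` value of a segment, `SegmentFilter` and `KeepConfident` are made-up names, and the 0.5 cutoff is an assumed starting point that would need tuning against real recordings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the segment object handed back by the library.
// Only the two members this sketch needs (Text, Probability) are modeled.
public record Segment(string Text, float Probability);

public static class SegmentFilter
{
    // Assumed cutoff, not a library default; tune it for your audio.
    public const float MinProbability = 0.5f;

    // Keeps only the segments whose probability is at or above the cutoff,
    // so low-confidence output (e.g. a hallucinated "you") is dropped.
    public static IEnumerable<Segment> KeepConfident(
        IEnumerable<Segment> segments,
        float minProbability = MinProbability)
    {
        return segments.Where(s => s.Probability >= minProbability);
    }
}
```

In a real pipeline, the processor would be built with WithProbabilities enabled, and the same comparison would be applied to the Probability property of each segment as it arrives. The member names may differ in later versions, given the rename mentioned above.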