argmaxinc / WhisperKit

On-device Speech Recognition for Apple Silicon
https://takeargmax.com/blog/whisperkit
MIT License
3.17k stars 267 forks

Fix language detection #133

Closed jkrukowski closed 4 months ago

jkrukowski commented 4 months ago

While working on audio chunking, I noticed that language detection is sometimes off. The language can be detected as <|nocaptions|>, which is a special token rather than a language, and this can result in a whole 30s segment being discarded.

This PR fixes that by defaulting to "en" when language detection returns something that is not a valid language, such as <|nocaptions|>.
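
For illustration, here is a minimal sketch of that kind of fallback. It is not the code from this PR: `resolveLanguage`, `knownLanguages`, and the `fallback` parameter are hypothetical names, and the language set is a small stand-in for the full list the Whisper tokenizer supports.

```swift
// Hypothetical subset of language codes, for illustration only.
// The real tokenizer supports many more languages.
let knownLanguages: Set<String> = ["en", "de", "es", "fr", "ja", "zh"]

/// Hypothetical helper (not part of WhisperKit's API).
/// Takes the token produced by language detection, e.g. "<|de|>",
/// and returns its language code, or `fallback` when the token is
/// not a known language (for example "<|nocaptions|>").
func resolveLanguage(fromDetectedToken token: String, fallback: String = "en") -> String {
    // A well-formed special token is wrapped in "<|" and "|>".
    guard token.hasPrefix("<|"), token.hasSuffix("|>"), token.count >= 4 else {
        return fallback
    }
    // Strip the two-character wrapper on each side.
    let code = String(token.dropFirst(2).dropLast(2))
    // Accept the code only if it names a known language.
    return knownLanguages.contains(code) ? code : fallback
}

print(resolveLanguage(fromDetectedToken: "<|de|>"))
print(resolveLanguage(fromDetectedToken: "<|nocaptions|>"))
```

With this approach a segment whose detected token is not a language is still decoded (as English here) instead of being dropped. The trade-off is that non-English audio that trips the detector is transcribed with the wrong language hint.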