argmaxinc / WhisperKit

On-device Speech Recognition for Apple Silicon

https://takeargmax.com/blog/whisperkit

MIT License

Fix language detection #133

Closed: jkrukowski closed this 4 months ago

jkrukowski commented 4 months ago

While working on audio chunking, I noticed that language detection is sometimes wrong. The language can be detected as <|nocaptions|>, which might result in a whole 30s segment being discarded.

This PR fixes that by defaulting to "en" when language detection returns something that is not a language, such as <|nocaptions|>.
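For illustration only, the fallback described above could be sketched roughly as follows. This is not the code from this PR: the `supportedLanguages` set, `languageCode(fromToken:)`, and `resolveLanguage(detectedToken:)` are hypothetical names made up for the example, and the language list is a small placeholder subset.

```swift
import Foundation

// Hypothetical subset of language codes, for illustration only.
// This is an assumption, not WhisperKit's actual language list.
let supportedLanguages: Set<String> = ["en", "de", "es", "fr", "ja", "pl"]

// Fallback used when detection does not yield a usable language.
let defaultLanguage = "en"

/// Strips the "<|" and "|>" wrapper from a special token, e.g. "<|pl|>" -> "pl".
/// Returns nil if the input is not shaped like a special token.
func languageCode(fromToken token: String) -> String? {
    guard token.hasPrefix("<|"), token.hasSuffix("|>"), token.count > 4 else {
        return nil
    }
    return String(token.dropFirst(2).dropLast(2))
}

/// Maps a detected token to a language code, falling back to the default
/// when the token is not a known language (for example "<|nocaptions|>").
func resolveLanguage(detectedToken: String) -> String {
    guard let code = languageCode(fromToken: detectedToken),
          supportedLanguages.contains(code) else {
        return defaultLanguage
    }
    return code
}
```

With this sketch, `resolveLanguage(detectedToken: "<|nocaptions|>")` returns "en" instead of propagating a non-language token, while a valid token such as "<|pl|>" still resolves to "pl".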