argmaxinc / WhisperKit

On-device Speech Recognition for Apple Silicon
https://takeargmax.com/blog/whisperkit
MIT License

Support distil whisper models #88

Closed ZachNagengast closed 6 months ago

ZachNagengast commented 6 months ago

Distil Whisper models are a great addition to the standard OpenAI models, especially for smaller devices, and with distil-large-v3 out now, it's a great time to add support for them.

Important note: This required some significant changes to how we search for models. The convenience methods for downloading a model now require the full model name, like "openai_whisper-large-v3", rather than just "large-v3". Otherwise, you need to download the model manually and point to it using the modelFolder param.

Swift

let pipe = try? await WhisperKit(model: "distil-whisper_distil-large-v3")

or

let pipe = try? await WhisperKit(model: "distil*large-v3")
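
To illustrate why the wildcard form resolves while the bare "large-v3" no longer does, here is a minimal sketch of glob-style name matching. It is not WhisperKit's actual lookup code: `matchingModels` and the `available` list are hypothetical names used only for this example, and POSIX `fnmatch` stands in for whatever matching the library performs internally.

```swift
import Foundation

// Hypothetical helper (not part of WhisperKit): returns every candidate
// model name that matches a glob-style pattern, using POSIX fnmatch.
func matchingModels(_ pattern: String, in available: [String]) -> [String] {
    available.filter { fnmatch(pattern, $0, 0) == 0 }
}

// Assumed list of model names, taken from the examples in this description.
let available = [
    "openai_whisper-large-v3",
    "distil-whisper_distil-large-v3",
]

// The wildcard pattern selects only the distil variant.
print(matchingModels("distil*large-v3", in: available))

// Without a wildcard the pattern must equal the whole name,
// so the short name matches nothing.
print(matchingModels("large-v3", in: available))
```

Under these assumptions, a pattern either has to spell out the full name or include a wildcard that covers the prefix, which is consistent with the two Swift forms shown above.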

CLI

swift run whisperkit-cli transcribe --model "large-v3" --model-prefix "distil" --audio-path ~/your-audio.mp3