This is the repo for swyx's blog. Blog content is created in GitHub issues, then posted on swyx.io as blog pages! Comment/watch to follow along with my blog within GitHub.
brew install ffmpeg # takes a while!
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
make
cd models
./download-ggml-model.sh base.en # or medium.en, see full model list here https://github.com/ggerganov/whisper.cpp/tree/master/models
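Since the setup takes a while, it can help to confirm everything landed before transcribing. Here is a minimal sketch of such a check. `check_setup`, `WHISPER_DIR`, and `MODEL` are hypothetical names made up for illustration, and it assumes the layout used in this post (a `main` binary in the whisper.cpp directory and models saved as `models/ggml-<model>.bin`).

```shell
#!/bin/sh
# Sketch only: check_setup is a hypothetical helper, not part of whisper.cpp.
# Assumes the layout from this post: ./main and models/ggml-<model>.bin.
check_setup() {
  whisper_dir="$1"   # path to your whisper.cpp checkout
  model="$2"         # e.g. base.en or medium.en
  missing=""
  command -v ffmpeg >/dev/null 2>&1 || missing="$missing ffmpeg"
  [ -x "$whisper_dir/main" ] || missing="$missing main"
  [ -f "$whisper_dir/models/ggml-$model.bin" ] || missing="$missing ggml-$model.bin"
  if [ -z "$missing" ]; then
    echo "setup looks complete"
  else
    echo "missing:$missing"
  fi
}

# WHISPER_DIR and MODEL are assumed environment variables with fallbacks.
check_setup "${WHISPER_DIR:-.}" "${MODEL:-medium.en}"
```

Run it from the whisper.cpp directory, or set `WHISPER_DIR` to point at your checkout.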
Steps
Convert your audio file to a 16 kHz .wav file: ffmpeg -i SOURCE_FILE.wav -ar 16000 output.wav
Then, inside of the whisper.cpp directory, you can do ./main -m models/ggml-medium.en.bin -f output.wav >> output.txt, which appends the transcription to output.txt.
Transcription runs at a rate of about 3 minutes of input per 2 minutes of transcribing (for the medium model, 769M params), or about 12 minutes of input per 1 minute of transcribing (for the base model, 74M params).
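The two steps and the rough timings above can be sketched as a small script. This is an illustration under assumptions, not a definitive workflow: `transcribe`, `estimate_minutes`, `WHISPER_DIR`, `MODEL`, and `DRY_RUN` are hypothetical names, and the derived output names (`<name>.16k.wav`, `<name>.txt`) are my own choice instead of the fixed output.wav / output.txt. It defaults to a dry run that only prints the commands, and the estimate uses only the approximate ratios quoted in this post, rounded up.

```shell
#!/bin/sh
# Sketch only: these helpers and variables are hypothetical, not part of whisper.cpp.
WHISPER_DIR="${WHISPER_DIR:-.}"   # assumed path to the whisper.cpp checkout
MODEL="${MODEL:-medium.en}"       # model name as passed to download-ggml-model.sh
DRY_RUN="${DRY_RUN:-1}"           # 1 = print the commands, 0 = actually run them

# Convert one audio file to 16 kHz WAV, then append its transcript to <name>.txt.
transcribe() {
  src="$1"
  base="${src%.*}"                # strip the file extension
  wav="$base.16k.wav"
  model_path="$WHISPER_DIR/models/ggml-$MODEL.bin"
  if [ "$DRY_RUN" = "1" ]; then
    echo "ffmpeg -i $src -ar 16000 $wav"
    echo "$WHISPER_DIR/main -m $model_path -f $wav >> $base.txt"
  else
    ffmpeg -i "$src" -ar 16000 "$wav" || return 1
    "$WHISPER_DIR/main" -m "$model_path" -f "$wav" >> "$base.txt"
  fi
}

# Rough transcription time in whole minutes (rounded up), using the ratios
# from this post: medium ~3 min audio per 2 min, base ~12 min audio per 1 min.
estimate_minutes() {
  audio_minutes="$1"
  case "$2" in
    medium*) echo $(( (audio_minutes * 2 + 2) / 3 )) ;;
    base*)   echo $(( (audio_minutes + 11) / 12 )) ;;
    *)       echo "unknown model: $2" >&2; return 1 ;;
  esac
}

transcribe "SOURCE_FILE.wav"
```

For example, `estimate_minutes 60 medium.en` gives 40 and `estimate_minutes 60 base.en` gives 5, so an hour-long podcast is a coffee break with the base model and closer to a lunch break with the medium one. Set `DRY_RUN=0` once the printed commands look right.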
slug: whisper-for-podcasts category: note
Prerequisites
The install commands at the top (ffmpeg, whisper.cpp, and the model download) are one-time setup things.