ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License
35.27k stars 3.6k forks

Radio transcript #2507

Open shmightworks opened 3 days ago

shmightworks commented 3 days ago

I'm doing a fun project to transcribe radio broadcasts. I'm taking 15-minute recordings and feeding them to whisper.cpp. More often than not, I've noticed that it passes talking through as [Music]; other times it doesn't output [Music] and the actual lyrics come out instead. I don't mind lyrics showing up in the output, but I do mind talking being skipped and output as [Music].

Is there something I can do to improve the results? I'm guessing maybe something along the lines of using the prompt argument to feed in some context.

Any suggestions? Thanks.

kth8 commented 20 hours ago

Which model are you using? Maybe try a larger one for better results.

shmightworks commented 19 hours ago

I'm just using ggml-base.bin (https://huggingface.co/ggerganov/whisper.cpp/blob/main/ggml-base.bin). Maybe I'll try ggml-medium.en.bin.
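
For anyone wanting to try both ideas from this thread together (a larger model plus the prompt argument), here is a rough Python sketch. It only builds the command line and measures how many transcript lines are bare [Music], so two runs can be compared. Several things are assumptions and not from the posts above: the binary path `./main` (the CLI may have a different name depending on the build), the model location `models/ggml-medium.en.bin`, the prompt wording, and the helper names `build_whisper_command`, `is_music_only` and `music_only_ratio`. Whether a prompt actually reduces the [Music] output is untested here.

```python
import shlex


def build_whisper_command(
    audio_path,
    model_path="models/ggml-medium.en.bin",  # assumed location of the larger model
    prompt="Talk radio broadcast with hosts, callers, news and ads.",  # example wording only
    binary="./main",  # assumption: adjust to whatever your build calls the CLI
):
    """Return an argv list for one whisper.cpp run with an initial prompt."""
    return [
        str(binary),
        "-m", str(model_path),   # model file
        "-f", str(audio_path),   # input recording
        "--prompt", prompt,      # initial prompt text
    ]


def is_music_only(line):
    """True if a transcript line contains nothing but the [Music] tag."""
    text = line.strip()
    if "-->" in text:
        # Drop a leading "[start --> end]" timestamp if one is present.
        text = text.partition("]")[2].strip()
    return text.upper() == "[MUSIC]"


def music_only_ratio(transcript_lines):
    """Fraction of non-empty transcript lines that are just [Music]."""
    lines = [ln for ln in transcript_lines if ln.strip()]
    if not lines:
        return 0.0
    return sum(is_music_only(ln) for ln in lines) / len(lines)


if __name__ == "__main__":
    # "radio_15min.wav" is a hypothetical file name.
    cmd = build_whisper_command("radio_15min.wav")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, capture_output=True, text=True)
    # and pass result.stdout.splitlines() to music_only_ratio().
```

Running the same recording with and without the prompt, and with ggml-base.bin versus ggml-medium.en.bin, then comparing `music_only_ratio` for each run gives a crude number for how much talking is being swallowed as [Music].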