Open NayamAmarshe opened 1 year ago
I won't be adding Whisper support for now because it requires really powerful hardware and consumes too many system resources. My main PC can't even handle it, so it would be a real challenge to implement and test. Adding extra languages is definitely something I want to do at some point, but it would require training new models.
@abb128 you could consider the whisper.cpp project, which uses the same trained models as OpenAI's Whisper but is implemented in C++ and claims to run on a potato (rpi3/4).
@kha84 That's the one I tried on my desktop, but even with the tiny model it consumed 100% of my CPU and didn't run fast enough for realtime. I believe "run on a potato (rpi3/4)" may mean that it runs at all, not that it runs faster than realtime, unless I did something horribly wrong. If it does indeed run in realtime on a Pi, please let me know.
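To make the "runs at all" versus "runs in realtime" distinction concrete, here is a minimal sketch of the usual real-time factor check: processing time divided by audio duration, where anything below 1.0 keeps up with live audio. The function names and the example timings are hypothetical illustrations, not measurements from whisper.cpp or aprilasr.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def keeps_up_with_realtime(processing_seconds: float, audio_seconds: float) -> bool:
    """True when the recognizer finishes a chunk before the next one arrives."""
    return real_time_factor(processing_seconds, audio_seconds) < 1.0


if __name__ == "__main__":
    # Hypothetical timings, for illustration only.
    # A 10 s chunk decoded in 5 s keeps up; one decoded in 12 s falls behind.
    print(keeps_up_with_realtime(5.0, 10.0))
    print(keeps_up_with_realtime(12.0, 10.0))
```

A model can "run" on weak hardware with a factor well above 1.0; it just can't be used for live captions, because the backlog keeps growing.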
Yeah sorry, I take my words back. I played with whisper.cpp for a while, and aprilasr provides much more consistent results at a fraction of the CPU cost.
I did some more reading and found that it can indeed run in realtime on a Pi 4, and on my computer as well, by adjusting some parameters in the stream example program: https://github.com/ggerganov/whisper.cpp/discussions/166
So maybe this could be viable after all, but I do find the latency a bit lacking.
What amount of training data is needed to add a new language? Would love to see support for Norwegian Bokmål (nb) and Norwegian Nynorsk (nn).
Would be amazing if we could do that. I wouldn't mind the latency, to be honest, if it worked for other languages.