ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

How to increase speech to text speed when using whisper cpp? #1635

Open ITHealer opened 8 months ago

ITHealer commented 8 months ago

Currently I am using whisper_tiny.en.tflite for my Android project, and the inference time for speech-to-text is quite long. https://github.com/usefulsensors/openai-whisper/tree/main/android_app/Whisper-TFLIte-Android-Example


I/flutter ( 3449): Model name: /assets/whisper-tiny.en.tflite
I/flutter ( 3449): Load success
I/flutter ( 3449): Load model time 0:00:03.274847
I/htotext_exampl( 3449): ProcessProfilingInfo new_methods=1340 is saved saved_to_disk=1 resolve_classes_delay=8000
I/tflite ( 3449): Initialized TensorFlow Lite runtime.
W/1.ui ( 3449): type=1400 audit(0.0:15783): avc: denied { read } for name="u:object_r:vendor_default_prop:s0" dev="tmpfs" ino=14533 scontext=u:r:untrusted_app:s0:c127,c257,c512,c768 tcontext=u:object_r:vendor_default_prop:s0 tclass=file permissive=0
E/libc ( 3449): Access denied finding property "ro.hardware.chipname"
I/tflite ( 3449): Created TensorFlow Lite XNNPACK delegate for CPU.
I/flutter ( 3449): Result process audio file: I love you.
I/flutter ( 3449): Processing time 0:00:11.894501


Is there any way to shorten this time? Please help me, thanks!

ITHealer commented 8 months ago

@ggerganov Can you give me directions?

dgm3333 commented 8 months ago

Buy a more powerful processor? Given that you haven't even provided basic details of your setup, it seems pretty unreasonable to target the primary dev over something that will always be limited by hardware, and which comes from a totally different repo than this one :-(

themanyone commented 8 months ago

I am the author of Caption Anything and Whisper Dictation. In #1653, I mentioned how speed can be improved by compiling whisper.cpp with acceleration such as CLBlast or cuBLAS, using the tiny models, or employing a client-server setup: connect slow clients like Android devices to faster computers running whisper.cpp ./server instances on the network.
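As a rough sketch of that client-server setup (build flags, model paths, and server options vary between whisper.cpp versions, and sample.wav is a placeholder for your own audio file):

```shell
# Build whisper.cpp with cuBLAS acceleration on a machine with an NVIDIA GPU
# (use a CLBlast-enabled build instead for OpenCL devices).
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
WHISPER_CUBLAS=1 make -j

# Fetch the tiny English model and start the bundled HTTP server example.
bash ./models/download-ggml-model.sh tiny.en
./server -m models/ggml-tiny.en.bin --port 8080

# A slow client (e.g. an Android app) then uploads audio over the network
# instead of running inference locally.
curl http://localhost:8080/inference \
  -F file=@sample.wav \
  -F response_format=json
```

The heavy lifting happens on the server's GPU, so the phone only pays for audio capture and a network round trip.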

I also ran into a case where generation was inexplicably slow because it was no longer using the GPU. The solution was to reload the video driver module (or reboot). (I had been testing -allow-unsupported-compiler, thinking I could outsmart the system.)

sudo modprobe -r nvidia_uvm; sudo modprobe nvidia_uvm

jordibruin commented 7 months ago

Android phones won't be able to run this fast.

zhouwg commented 5 months ago

The Xiaomi 14 could do that (with Xiaomi's proprietary 6B on-device AI model) because it contains a very powerful mobile SoC.