alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0
7.92k stars, 1.1k forks

Increasing performance speed and Accuracy #490

Open problemSolvingProgramming opened 3 years ago

problemSolvingProgramming commented 3 years ago

Hi,

I have tested the Vosk API on a CPU-only computer. It takes about 90 seconds to recognize 60 seconds of microphone audio. During processing it uses about 25% of the CPU (I think it runs on only one core). I use an Intel Core i7 CPU at 4.3 GHz on 64-bit Windows.

Is there any way to use more cores and speed up speech-to-text recognition? The only option I can think of is overclocking. I was wondering whether there is an environment variable for Vosk or Kaldi that sets the number of processing threads.

nshmyrev commented 3 years ago

> It takes about 90 seconds to recognize a 60 seconds microphone

Vosk is normally much faster than that. You are probably doing something wrong, for example not calling Result during processing.
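For readers unsure what calling Result during processing looks like, here is a minimal sketch, assuming the Python bindings and their `KaldiRecognizer` methods `AcceptWaveform`, `Result`, and `FinalResult`. The helper names `transcribe_stream` and `transcribe_wav`, the 4000-frame chunk size, and the 16-bit mono WAV input are illustrative assumptions, not something stated in this thread.

```python
import json


def transcribe_stream(recognizer, chunks):
    """Feed audio chunks to a recognizer and collect the finalized text.

    `recognizer` is assumed to behave like vosk.KaldiRecognizer.
    """
    texts = []
    for data in chunks:
        if not data:
            break
        # AcceptWaveform returns True once an utterance has been finalized;
        # Result() is then called right away, inside the processing loop.
        if recognizer.AcceptWaveform(data):
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                texts.append(text)
    # Flush whatever audio is still buffered after the last chunk.
    text = json.loads(recognizer.FinalResult()).get("text", "")
    if text:
        texts.append(text)
    return " ".join(texts)


def transcribe_wav(wav_path, model_path):
    """Hypothetical wrapper: run the loop above over a 16-bit mono PCM WAV."""
    import wave
    from vosk import Model, KaldiRecognizer  # requires `pip install vosk`

    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_path), wf.getframerate())
        # Read 4000 frames at a time until readframes() returns b"".
        chunks = iter(lambda: wf.readframes(4000), b"")
        return transcribe_stream(rec, chunks)
```

Calling `transcribe_wav("recording.wav", "model")` with your own file and model directory would return the transcript as one string. Whether a missing `Result` call explains the slowdown reported above is not confirmed in this thread.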

madkote commented 3 years ago

The accuracy of the Vosk decoder is already good (more accurate than the available Kaldi recognizer), probably due to Kaldi customization?

etlweather commented 2 years ago

@problemSolvingProgramming On a Core i5-8500 3 GHz CPU, I transcribe a 20-minute file in 2 minutes. This is done using the test_ffmpeg.py example file with no changes, and the daanzu model.

Note that this does not include the time it takes to load the model into memory, but that adds only a few seconds unless you have a very slow hard drive.
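One way to put the timings in this thread on the same scale is the real-time factor (processing time divided by audio duration), a common speech recognition metric that the posters do not name themselves. The sketch below only restates the numbers quoted above; the function name `real_time_factor` is hypothetical, and the two reports use different hardware, models, and input modes, so they are not directly comparable.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Return processing time divided by audio duration (below 1.0 is faster than real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Timings as reported in the thread, converted to seconds.
reports = {
    "Core i7, microphone input": (90, 60),
    "Core i5-8500, 20-minute file": (2 * 60, 20 * 60),
}

for label, (processing, audio) in reports.items():
    print(f"{label}: RTF = {real_time_factor(processing, audio):.2f}")
```

By this measure the first report is slower than real time (1.5) and the second is about ten times faster than real time (0.1).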