alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

run the vosk on worker task in nodejs #565

Open ilovebioz opened 3 years ago

ilovebioz commented 3 years ago

Hi,

I succeeded in running the Node.js simple example that uses FFmpeg. Now I would like to use the Bree scheduler to execute that example as a worker task. But whenever execution reaches the part below, the process is killed automatically. Could you please help me understand what the problem is and how I can fix it? Thank you very much.

    const libvosk = ffi.Library(soname, {
      'vosk_set_log_level': ['void', ['int']],
      'vosk_model_new': [vosk_model_ptr, ['string']],
      'vosk_model_free': ['void', [vosk_model_ptr]],
      'vosk_spk_model_new': [vosk_spk_model_ptr, ['string']],
      'vosk_spk_model_free': ['void', [vosk_spk_model_ptr]],
      'vosk_recognizer_new': [vosk_recognizer_ptr, [vosk_model_ptr, 'float']],
      'vosk_recognizer_new_spk': [vosk_recognizer_ptr, [vosk_model_ptr, vosk_spk_model_ptr, 'float']],
      'vosk_recognizer_new_grm': [vosk_recognizer_ptr, [vosk_model_ptr, 'float', 'string']],
      'vosk_recognizer_free': ['void', [vosk_recognizer_ptr]],
      'vosk_recognizer_accept_waveform': ['bool', [vosk_recognizer_ptr, 'pointer', 'int']],
      'vosk_recognizer_result': ['string', [vosk_recognizer_ptr]],
      'vosk_recognizer_final_result': ['string', [vosk_recognizer_ptr]],
      'vosk_recognizer_partial_result': ['string', [vosk_recognizer_ptr]],
    });

solyarisoftware commented 3 years ago

Your problem is not well described. Could you explain better what you want to do? What do you mean by "worker task"? An external process? Or a worker thread?

ilovebioz commented 3 years ago

Hi,

I used Bree (a scheduler module for Node.js; one special feature of this scheduler is that it creates a worker thread, via worker_threads, for each task).

    const Bree = require('bree');

    const bree = new Bree({
      // logger: new Cabin(),
      root: false,
      //outputWorkerMetadata: true,
      jobs: [
        {
          name: 'taskProcess',
          path: path.join(global.STTSERVICE.jobPath, 'taskProcess.js'),
          interval: '6s',
          worker: {
            workerData: {
              info: global.STTSERVICE,
            }
          }
        },
      ],
    });

    bree.start('taskProcess');

The taskProcess job in the code above is similar to the Vosk FFmpeg sample. It loads the DLL, calls FFmpeg, and does the STT. With this structure, each job runs on a worker thread (not in the main process, as in the original example). Whenever it reaches the library declaration:

    const libvosk = ffi.Library(soname, {
      'vosk_set_log_level': ['void', ['int']],
      'vosk_model_new': [vosk_model_ptr, ['string']],
      'vosk_model_free': ['void', [vosk_model_ptr]],
      'vosk_spk_model_new': [vosk_spk_model_ptr, ['string']],
      'vosk_spk_model_free': ['void', [vosk_spk_model_ptr]],
      'vosk_recognizer_new': [vosk_recognizer_ptr, [vosk_model_ptr, 'float']],
      'vosk_recognizer_new_spk': [vosk_recognizer_ptr, [vosk_model_ptr, vosk_spk_model_ptr, 'float']],
      'vosk_recognizer_new_grm': [vosk_recognizer_ptr, [vosk_model_ptr, 'float', 'string']],
      'vosk_recognizer_free': ['void', [vosk_recognizer_ptr]],
      'vosk_recognizer_accept_waveform': ['bool', [vosk_recognizer_ptr, 'pointer', 'int']],
      'vosk_recognizer_result': ['string', [vosk_recognizer_ptr]],
      'vosk_recognizer_final_result': ['string', [vosk_recognizer_ptr]],
      'vosk_recognizer_partial_result': ['string', [vosk_recognizer_ptr]],
    });

the main process exits.

I hope now everything is clearer.

Thank you very much!

solyarisoftware commented 3 years ago

Well,

You didn't specify the exit error of your main process. But now it's clearer what you want to do: you want to transcode files with FFmpeg and transcribe them with Vosk, using worker_threads with a scheduler on top (maybe you want to build a server architecture). What is not clear is WHY you want to proceed this way.
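On reporting the exit error: a minimal sketch (not from this thread) of how the error and exit code of a plain worker_threads Worker can be captured in the parent. The helper name `runWorker` is hypothetical, and the worker source is passed inline with `eval: true` only to keep the example self-contained. Bree creates its workers itself, so this shows the underlying mechanism rather than Bree's API. It only catches JavaScript errors; a crash inside a native library typically takes down the whole process, in which case the exit code or signal reported by the shell is the thing to include in the report.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical helper: run inline worker source and report how the worker ended.
function runWorker(source) {
  return new Promise((resolve) => {
    const worker = new Worker(source, { eval: true });
    let error = null;
    // 'error' fires when the worker throws an uncaught JavaScript exception.
    worker.on('error', (err) => { error = err; });
    // 'exit' always fires last, with the worker's exit code.
    worker.on('exit', (exitCode) => resolve({ exitCode, error }));
  });
}

async function demo() {
  const failed = await runWorker("throw new Error('boom')");
  console.log(failed.exitCode, failed.error && failed.error.message);

  const ok = await runWorker('1 + 1');
  console.log(ok.exitCode, ok.error);
}
```

Calling `demo()` logs the exit code and error message for a failing worker and for a clean one.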


Unfortunately, Node.js worker_threads conflict with Vosk's (memory) architecture. I explain why below.

Please remember that in Vosk the loaded language model can occupy a lot of RAM. E.g. the large English model takes ~3.3 GB (let's call this size M)! So you want to load the model ONCE; otherwise, if you load the model in each of T threads (or processes), you will allocate T * M GB!
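The arithmetic above can be written out as a short sketch. The function name `estimateModelMemoryGB` and the choice of T = 4 workers are illustrative assumptions; only the ~3.3 GB figure comes from the comment above.

```javascript
// Hypothetical helper: RAM taken by model copies, in GB.
// If the model is shared, it is loaded once; otherwise each worker loads its own copy.
function estimateModelMemoryGB(modelSizeGB, workers, sharedModel) {
  return sharedModel ? modelSizeGB : modelSizeGB * workers;
}

const M = 3.3; // approximate size of the large English model, in GB
const T = 4;   // assumed number of threads or processes

console.log(estimateModelMemoryGB(M, T, false).toFixed(1)); // 13.2
console.log(estimateModelMemoryGB(M, T, true).toFixed(1));  // 3.3
```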

So if you want to delegate FFmpeg transcoding and speech-to-text tasks, I propose different approaches:

See also VoskJs, my nodejs Vosk wrapper, with server examples: https://github.com/solyarisoftware/voskJs/

ilovebioz commented 3 years ago

Hi,

First, I would like to thank you for your kind explanation. I would like to build a RESTful server that does the same job as VoskJs but with slightly different logic. In my server, the API just receives requests and stores them in a queue; a batch job will handle the STT processing. That is why I would like to make the FFmpeg STT a worker thread in Nodejs. If I run the batch job on the main process, it will block the API server and clients will not be able to send requests while it is working. Thank you again for your support; I will study solution 2 to solve my problem.

solyarisoftware commented 3 years ago

> In my server, the API just receives requests and stores them in a queue; a batch job will handle the STT processing.

If you want to build an ASR decoder server architecture, you probably want to keep latencies as low as possible. Right? :-)

But if you use a job queue manager, you are just serializing requests and delegating them to a backend system to fulfill. That way you do not block the Node.js main thread (of the server), OK, but you do not solve the entire problem (minimizing latency).

Of course, it all depends on the CPU cores available on your host.
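To illustrate the latency point, here is a minimal sketch of a serial job queue. `SerialQueue` and `fakeTranscribe` are hypothetical names, and the 50 ms timer merely stands in for real FFmpeg and Vosk work. Enqueuing returns immediately, so the caller is not blocked, yet each job still waits for all the jobs ahead of it.

```javascript
// Hypothetical serial queue: jobs run one at a time, in arrival order.
class SerialQueue {
  constructor(handler) {
    this.handler = handler;         // async function that processes one job
    this.tail = Promise.resolve();  // promise chain that serializes the jobs
  }

  enqueue(job) {
    const result = this.tail.then(() => this.handler(job));
    // Keep the chain usable even if a job fails.
    this.tail = result.catch(() => {});
    return result;
  }
}

// Stand-in for a transcription job (assumption: takes about 50 ms).
const fakeTranscribe = (job) =>
  new Promise((resolve) => setTimeout(() => resolve(`done:${job}`), 50));

async function demo() {
  const queue = new SerialQueue(fakeTranscribe);
  const start = Date.now();
  return Promise.all(['a', 'b', 'c'].map(async (job) => {
    const value = await queue.enqueue(job);
    return { value, waitedMs: Date.now() - start };
  }));
}
```

Running `demo().then(console.log)` shows the waiting time growing with the position in the queue: the third request waits for roughly three job durations, even though the caller was never blocked.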

> That is why I would like to make the FFmpeg STT a worker thread in Nodejs.

Warning: you can't pass the Vosk Model using worker_threads because the Model object contains functions! See: https://github.com/alphacep/vosk-api/issues/502.
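The restriction can be checked without Vosk at all. Values sent to a worker (through `workerData` or `postMessage`) are copied with the structured clone algorithm, which rejects functions. In the sketch below, `canPostToWorker` is a hypothetical helper, and both sample objects, including the path `/path/to/model`, are made up for illustration; the second object only mimics a wrapper that carries a method.

```javascript
const { MessageChannel } = require('worker_threads');

// Hypothetical helper: true if `value` can be sent through a worker_threads port.
function canPostToWorker(value) {
  const { port1, port2 } = new MessageChannel();
  try {
    port1.postMessage(value); // throws synchronously if the value cannot be cloned
    return true;
  } catch (err) {
    return false;
  } finally {
    // Close both ports so they do not keep the process alive.
    port1.close();
    port2.close();
  }
}

// Plain data (strings, numbers) can be cloned.
const plainData = { modelPath: '/path/to/model', sampleRate: 16000 };
// An object that carries a function cannot.
const objectWithMethod = { modelPath: '/path/to/model', free() {} };

console.log(canPostToWorker(plainData));        // true
console.log(canPostToWorker(objectWithMethod)); // false
```

One consequence is that plain settings such as a model path can be handed to a worker, while a loaded model object cannot.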

All in all, you got some info, but you did not describe the Vosk issue in detail, so I suggest closing this issue and perhaps opening another one with a well-detailed problem related to Vosk.

BTW, if you find VoskJs useful, I appreciate a star there :)