How to convert a static model to a dynamic one

alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Apache License 2.0


How to convert a static model to a dynamic one #545

Closed victr-bld closed 2 years ago

victr-bld commented 3 years ago

Hi everyone. First of all, I am totally new to the ASR scene and English is not my native language, so I apologize in advance.

As I understand it, if a model has a Gr.fst it is a dynamic one, which means its vocabulary can be restricted (as in test_words.py, am I right?). In my Vosk application, I need to restrict the vocabulary as much as I can to save space and speed up the whole process (for an embedded system). Is there a way to convert the much more accurate static models into dynamic, restrictable ones?

I read through the adaptation page (https://alphacephei.com/vosk/adaptation) and Krisztián Varga's article (https://chrisearch.wordpress.com/2017/03/11/speech-recognition-using-kaldi-extending-and-using-the-aspire-model/) but (I think) found nothing...

If someone could help me understand in depth what these model types imply and how they can be modified, that would be great!

Sorry for probably wasting your time and thanks a lot for the quality of your work.

nshmyrev commented 3 years ago

Hello

If a model has a Gr.fst it is a dynamic one, which means its vocabulary can be restricted (as in test_words.py, am I right?).

Yes

Is there a way to convert the much more accurate static models into dynamic, restrictable ones?

If you have all required files for the model (final.mdl, tree, phones), you can build whatever graph you like - static or dynamic.
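As a rough pre-flight check before building a graph, the sketch below reports which of the files named above are missing. The exact file names and the split into a lang directory and a model directory are assumptions based on a typical Kaldi layout, and `check_model_files` is a hypothetical helper, not part of Kaldi or Vosk.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which inputs for graph building are missing.
# Assumed layout: <lang-dir> holds phones.txt, words.txt and L_disambig.fst;
# <model-dir> holds final.mdl and tree. Adjust the list to your model.
check_model_files() {
  local lang_dir=$1 model_dir=$2 missing=0 f
  for f in "$lang_dir/phones.txt" "$lang_dir/words.txt" "$lang_dir/L_disambig.fst" \
           "$model_dir/final.mdl" "$model_dir/tree"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  # Return 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

For example, `check_model_files data/lang exp/model` (both paths are placeholders) prints one line per missing file and returns a non-zero status if anything is absent.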

but (I think) found nothing

Dynamic graphs are not mentioned there. Static graphs are built with the mkgraph.sh script and dynamic graphs with the mkgraph_lookahead.sh script. The Kaldi repo has an example.
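To make the difference concrete, here is a minimal sketch that only prints the command for each graph type instead of running it. It assumes both scripts are invoked from a Kaldi recipe directory as `utils/<script> <lang-dir> <model-dir> <graph-dir>`; `graph_command` is a hypothetical helper, all directory names are placeholders, and it is not the example from the Kaldi repo.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the Kaldi command for a graph type.
# Assumes the three-argument form <lang-dir> <model-dir> <graph-dir>.
graph_command() {
  local kind=$1 lang_dir=$2 model_dir=$3 graph_dir=$4
  case "$kind" in
    static)  echo "utils/mkgraph.sh $lang_dir $model_dir $graph_dir" ;;
    dynamic) echo "utils/mkgraph_lookahead.sh $lang_dir $model_dir $graph_dir" ;;
    *)       echo "unknown graph type: $kind" >&2; return 1 ;;
  esac
}

# Placeholder paths; replace them with the directories of your own model.
graph_command static  data/lang exp/model exp/model/graph
graph_command dynamic data/lang exp/model exp/model/graph_lookahead
```

Printing the command first makes it easy to confirm the argument order before starting a build that may take a while on a large model.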

Let me know if you have further questions.

victr-bld commented 3 years ago

Hello,

Thanks a lot for the answer! I just had a little trouble with the script. As a test, I am using the US English Kaldi ASPIRE model (3.2 GB) and running the script like this:

bash mkgraph_lookahead.sh /directory/model-us/ /directory/model-us/am/ /directory/model-us/graph/

I got an error about L_disambig.fst, so I tried to continue by using the data in kaldi/egs/mini_librispeech/s5/data/lang instead, and that seems to work. (Am I doing something wrong?)

At this point, I have the following error:

mkgraph_lookahead.sh: line 93: tree-info: command not found
Error when getting context-width

Could you give me some advice or a source to dig into? Thanks a lot and sorry for my beginner's question :)

Have a nice day.

nshmyrev commented 3 years ago

This error means the path to the Kaldi binaries is not configured in path.sh, so the script cannot find the tree-info binary.
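A quick way to confirm this kind of problem is to check whether the binaries resolve on PATH after sourcing path.sh. The sketch below is a hypothetical helper (`missing_binaries` is not part of Kaldi); the two binary names in the usage comment are the ones mentioned in this thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list the given binaries that cannot be found on PATH.
missing_binaries() {
  local bin missing=0
  for bin in "$@"; do
    if ! command -v "$bin" >/dev/null 2>&1; then
      echo "not on PATH: $bin"
      missing=1
    fi
  done
  # Return 0 when every binary resolves, 1 otherwise.
  return "$missing"
}

# Typical use from a Kaldi recipe directory (for example egs/mini_librispeech/s5):
#   . ./path.sh
#   missing_binaries tree-info fstdeterminizestar
```

If a binary is reported as missing, either path.sh points at the wrong Kaldi root or Kaldi has not been compiled yet.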

victr-bld commented 3 years ago

OK, I found my problem, and I feel totally stupid now. (For those who have the same problem: try recompiling Kaldi, or simply launch mkgraph_lookahead.sh from the s5 directory by running utils/mkgraph_lookahead.sh.) I now get the classic "kaldi::KaldiFatalErrorERROR: FstHeader::Read: Bad FST header: standard input" error, but I have seen a few issues about it, so that's OK. Thanks again, I'll let you know if I have further questions.

victr-bld commented 3 years ago

After some research, I still have not found a solution to this error. However, I think the problem is at the "fstdeterminizestar" line and seems to be linked to my L_disambig.fst. Is it because it comes from kaldi/egs/mini_librispeech/s5/data/lang? If so, what am I supposed to do to create a new one?

Thanks a lot!

nshmyrev commented 3 years ago

Is it because it comes from kaldi/egs/mini_librispeech/s5/data/lang?

No, the mini_librispeech lang directory is different from the ASPIRE one. If you work with the ASPIRE model, you need to take the lang directory from the ASPIRE model.

If so, what am I supposed to do to create a new one?

It depends on the model you are working on.

victr-bld commented 3 years ago

OK, so now I get this one:

ERROR: GenericRegister::GetEntry: olabel_lookahead-fst.so: cannot open shared object file: No such file or directory
FATAL: Fst::Convert: Unknown FST type olabel_lookahead (arc type standard)

I feel like I am close to the end, but I still need your help with this last remaining error... Where could it come from?

nshmyrev commented 2 years ago

ERROR: GenericRegister::GetEntry: olabel_lookahead-fst.so: cannot open shared object file: No such file or directory

It is a problem with LD_LIBRARY_PATH: the dynamic loader cannot find olabel_lookahead-fst.so.
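One possible way to fix this is to locate the library and prepend its directory to LD_LIBRARY_PATH before running the graph script. The sketch below is a hypothetical helper (`add_lib_dir` is not part of Kaldi or OpenFst). The assumption that the library sits somewhere under the OpenFst directory of your Kaldi checkout, and the KALDI_ROOT variable in the usage comment, should be checked against your own installation.

```bash
#!/usr/bin/env bash
# Hypothetical helper: find the directory that contains a shared library
# under a given root and prepend it to LD_LIBRARY_PATH.
add_lib_dir() {
  local root=$1 lib=$2 hit
  # -H follows the root itself if it is a symlink.
  hit=$(find -H "$root" -name "$lib" -print 2>/dev/null | head -n 1)
  if [ -z "$hit" ]; then
    echo "not found under $root: $lib" >&2
    return 1
  fi
  # Keep any existing value, separated by a colon.
  export LD_LIBRARY_PATH="$(dirname "$hit")${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Example (KALDI_ROOT is assumed to point at your Kaldi checkout):
#   add_lib_dir "$KALDI_ROOT/tools/openfst" olabel_lookahead-fst.so
```

If the helper reports that the file is not found, OpenFst was probably built without its lookahead extension and needs to be rebuilt before a dynamic graph can be created.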