alphacep / vosk-android-demo

Offline speech recognition for Android with Vosk library.
Apache License 2.0

Using custom model tdnn_lstm in vosk? #177

Closed mlexplore1122 closed 2 years ago

mlexplore1122 commented 2 years ago

Hi, I have trained a model with nnet3 tdnn_lstm, and the output model on the server has the structure shown in this image: [image]. I tried replacing the "model-en-us" folder in assets with my model's files, as in this image: [image]. But it fails when running: [image]. So I wonder: does Vosk support tdnn_lstm, and what is wrong with my setup for running my own model? Thank you.

nshmyrev commented 2 years ago

You don't have conf/model.conf in your model. It is mandatory. You need to create that file. You can copy from another model.
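A quick way to catch this before deploying is to check the model folder for the mandatory file. The following is a minimal sketch, not part of Vosk: the helper name `missing_model_files` and the default list of required paths are assumptions for illustration, and only `conf/model.conf` is confirmed as mandatory in this thread.

```python
from pathlib import Path


def missing_model_files(model_dir, required=("conf/model.conf",)):
    """Return the required relative paths that do not exist under model_dir.

    `required` is a hypothetical list; extend it with whatever files
    your own model layout needs.
    """
    root = Path(model_dir)
    return [rel for rel in required if not (root / rel).is_file()]
```

For example, `missing_model_files("model-en-us")` returns an empty list when `conf/model.conf` is present, and `["conf/model.conf"]` when it is not.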

mlexplore1122 commented 2 years ago

Thanks @nshmyrev, I have added conf/model.conf from another model. But my model does not use only MFCC; it also uses pitch features. So I have the file online.conf: [image]

and the file online_cmvn.conf: [image]

and finally the file online_pitch.conf: [image]. I don't know how to copy all of these configs into model.conf, or whether I need to do something else to make it work. If I do nothing, the model will fail when running, because it does not know that the input has 40 MFCC dimensions plus 4 pitch dimensions. Thanks

nshmyrev commented 2 years ago

You need to create conf/pitch.conf then as described in https://alphacephei.com/vosk/models#model-structure

mlexplore1122 commented 2 years ago

Thank you. I will try it.

mlexplore1122 commented 2 years ago

My pitch config is the same as this one: https://github.com/kaldi-asr/kaldi/blob/master/egs/csj/s5/conf/online_pitch.conf I tried moving this config into conf/pitch.conf: [image]. But it does not understand all of the options in pitch.conf: [image]. Can you check it, @nshmyrev? Thanks.

nshmyrev commented 2 years ago

Leave just --add-raw-log-pitch and probably --delay
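Trimming the file by hand works, but the same filtering can be scripted. This is a minimal sketch under stated assumptions: `filter_config` is a hypothetical helper, and the option values in `SAMPLE` are illustrative only, not copied from any real config. The two kept option names are the ones mentioned above.

```python
def filter_config(text, keep=("--add-raw-log-pitch", "--delay")):
    """Keep only the option lines whose option name appears in `keep`."""
    kept = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        stripped = line.split("#", 1)[0].strip()
        if not stripped:
            continue
        # The option name is everything before the first '='.
        name = stripped.split("=", 1)[0]
        if name in keep:
            kept.append(stripped)
    return "\n".join(kept) + ("\n" if kept else "")


# Illustrative input; the values are assumptions, not a real config.
SAMPLE = """\
--sample-frequency=16000
--frames-per-chunk=10
--add-raw-log-pitch=true  # fourth pitch dimension
--delay=5
"""
```

Calling `filter_config(SAMPLE)` yields a two-line config containing only `--add-raw-log-pitch=true` and `--delay=5`, which could then be written to conf/pitch.conf.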

mlexplore1122 commented 2 years ago

I tried removing each option in the list, but the important one, --add-raw-log-pitch=true, fails: it shows "Invalid option --add-raw-log-pitch=true in config file". If I remove this option, that error goes away, but at runtime another error reports that the input feature has dimension 43 while the model expects 44. So I still can't run this. :(
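The 43 versus 44 mismatch is consistent with one pitch feature being dropped. The sketch below does the arithmetic, assuming that the processed pitch output consists of three features by default (probability of voicing, normalized log pitch, delta pitch) and that the raw log pitch is a fourth, off by default. Those defaults are an assumption about the Kaldi pitch post-processing, not something stated in this thread, and `feature_dim` is a hypothetical helper.

```python
def feature_dim(mfcc_dim=40,
                add_pov_feature=True,
                add_normalized_log_pitch=True,
                add_delta_pitch=True,
                add_raw_log_pitch=False):
    """Total input dimension: MFCC plus the enabled pitch features.

    The default flags are assumed, not taken from the thread.
    """
    pitch_dim = sum([add_pov_feature,
                     add_normalized_log_pitch,
                     add_delta_pitch,
                     add_raw_log_pitch])
    return mfcc_dim + pitch_dim
```

Under these assumptions, `feature_dim()` gives 43, the dimension seen when the option is removed, and `feature_dim(add_raw_log_pitch=True)` gives 44, the dimension the model expects.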

nshmyrev commented 2 years ago

Ok, so I've just pushed this fix https://github.com/alphacep/vosk-api/commit/ad546a8f1a915ee166407e8d82c8cd2cc9cb8ec0 which should help you. But you need to rebuild the library yourself.

mlexplore1122 commented 2 years ago

OK, it works. Thanks very much, @nshmyrev.