srvk / lm_build

Adapting your own Language Model for Kaldi
http://speechkitchen.org/kaldi-language-model-building/

Missing Script : ctc_compile_dict_token.sh in Utils folder #3

Closed Thunderlbc closed 6 years ago

Thunderlbc commented 6 years ago

Hi~ I noticed that the main script run_adapt.sh calls ../utils/ctc_compile_dict_token.sh (line 66), but no script with that name exists under the utils folder.

Should I copy it from somewhere else?

Could you please check this? ~

Best regards,

Li

riebling commented 6 years ago

lm_build is meant to be installed alongside Eesen, which may be why the ../utils/ folder is not present for you. So yes, if you are not using Eesen, you will need to get the script from there: lm_build is meant to go with Eesen, in order to re-train a language model compatible with an Eesen experiment such as tedlium. See http://www.github.com/srvk/eesen and http://www.github.com/srvk/eesen-transcriber (a tool that uses pre-trained Eesen tedlium models to transcribe speech).

The intention of the lm_build tool is to supplement the above systems, allowing you to extend the LM with new vocabulary: http://speechkitchen.org/kaldi-language-model-building/

riebling commented 6 years ago

but: if all you want is the utils/ folder, it can be found here: https://github.com/srvk/eesen/tree/master/asr_egs/wsj/utils
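
For anyone landing here with the same error, a minimal sketch of how one might check for the missing script and pull in the utils/ folder from the Eesen repository. The function names `check_utils` and `fetch_utils` are made up for illustration, and the sketch assumes the linked folder contains ctc_compile_dict_token.sh and that `git` is installed; it is not part of lm_build itself.

```shell
#!/usr/bin/env bash
# Sketch only: check_utils and fetch_utils are hypothetical helper names.

# Report whether the script that run_adapt.sh calls is present.
# The default ../utils mirrors the path mentioned in this thread.
check_utils() {
  local utils_dir="${1:-../utils}"
  local script="$utils_dir/ctc_compile_dict_token.sh"
  if [ -f "$script" ]; then
    echo "found: $script"
    return 0
  fi
  echo "missing: $script" >&2
  return 1
}

# Clone Eesen into a temporary folder and copy asr_egs/wsj/utils
# into the destination (assumes network access and git).
fetch_utils() {
  local dest="${1:-../utils}"
  local tmp
  tmp="$(mktemp -d)"
  git clone --depth 1 https://github.com/srvk/eesen.git "$tmp/eesen" &&
    mkdir -p "$dest" &&
    cp -r "$tmp/eesen/asr_egs/wsj/utils/." "$dest/"
  local rc=$?
  rm -rf "$tmp"
  return $rc
}
```

A possible way to use it from the lm_build folder: `check_utils ../utils || fetch_utils ../utils`.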

Thunderlbc commented 6 years ago

Thank you @riebling ~ I'm currently using traditional Kaldi instead of Eesen, so I intend to use lm_build as an adaptation tool for a customized language model during decoding.

gray4what commented 3 years ago

Update KALDI_ROOT in path.sh to point to your Kaldi folder.
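
As a rough illustration of that change (the actual contents of path.sh in lm_build may differ), the line could look like the sketch below. The `$HOME/kaldi` location is only a placeholder, and `check_kaldi_root` is a hypothetical helper that relies on a Kaldi checkout having `src/` and `tools/` subfolders.

```shell
# In path.sh: point KALDI_ROOT at your own Kaldi checkout.
# $HOME/kaldi is a placeholder, not a default that lm_build ships with.
export KALDI_ROOT="${KALDI_ROOT:-$HOME/kaldi}"

# Hypothetical sanity check: a Kaldi checkout has src/ and tools/ subfolders.
check_kaldi_root() {
  local root="${1:-$KALDI_ROOT}"
  if [ -d "$root/src" ] && [ -d "$root/tools" ]; then
    echo "KALDI_ROOT looks usable: $root"
  else
    echo "KALDI_ROOT does not look like a Kaldi checkout: $root" >&2
    return 1
  fi
}
```

Running `check_kaldi_root` after sourcing path.sh gives a quick hint about whether the variable points at the right folder before starting run_adapt.sh.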