opensource-spraakherkenning-nl / Kaldi_NL

Code related to the Dutch instance and user groups of the KALDI speech recognition toolkit
http://www.opensource-spraakherkenning.nl
Apache License 2.0

decode.sh is empty #2

Closed wmelder closed 7 years ago

wmelder commented 7 years ago

I figured decode.sh may not have been written because the configuration never completed. After I choose the configuration options in the dialog started by ./configure.sh, the script doesn't get past the screen with this text:

Const LM generation
Creating ConstLM for rescore, This may take a while
Creating ConstLM

Any ideas what went wrong?

Technical information:
OS: Red Hat Enterprise Linux Server release 7.2 (Maipo)
CPU: 4 x Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz (2 cpu cores)
MP3 plugin for Sox wasn't installed.
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
Memory: 32GB

Models chosen:
AM: NL/UTwente/HMI/AM/CGN_large/nnet3_online/tdnn_cleaned
LM: v1.0/KrantenTT.3gpr.kn.int.arpa.gz
Rescore LM: NL/UTwente/HMI/LM/KrantenTT & v1.0/KrantenTT.4gpr.kn.int.arpa.gz

Michel-NL commented 7 years ago

This step takes several hours to complete. If you open a new console and run "top", you can see that there is activity going on. It would be nice to have a progress bar or some other indication that it is doing something in the background :)

wmelder commented 7 years ago

I see that this dialog was started from local/configure_decode.sh and that this is one step before decode.sh creation is started.

wmelder commented 7 years ago

I don't see a process in top. My configure.log gives this at the end of the file:

Creating models/NL/UTwente/HMI/LM/KrantenTT/v1.0/LG_KrantenTT.3gpr.kn.int_UTwente_HMI_lexicon/lang/lexicon.txt from models/NL/UTwente/HMI/LM/KrantenTT/v1.0/LG_KrantenTT.3gpr.kn.int_UTwente_HMI_lexicon/lang/lexiconp.txt
mkgraph.sh: expected models/NL/UTwente/HMI/LM/KrantenTT/v1.0/LG_KrantenTT.3gpr.kn.int_UTwente_HMI_lexicon/L.fst to exist
mkgraph.sh: expected models/NL/UTwente/HMI/LM/KrantenTT/v1.0/LG_KrantenTT.3gpr.kn.int_UTwente_HMI_lexicon/L.fst to exist
cat: models/NL/UTwente/HMI/LM/KrantenTT/v1.0/LG_KrantenTT.3gpr.kn.int_UTwente_HMI_lexicon/Const_UTwente_HMI_KrantenTT_v1.0_KrantenTT.4gpr.kn.int/oov.int: No such file or directory
grep: models/NL/UTwente/HMI/LM/KrantenTT/v1.0/LG_KrantenTT.3gpr.kn.int_UTwente_HMI_lexicon/Const_UTwente_HMI_KrantenTT_v1.0_KrantenTT.4gpr.kn.int/words.txt: No such file or directory
grep: models/NL/UTwente/HMI/LM/KrantenTT/v1.0/LG_KrantenTT.3gpr.kn.int_UTwente_HMI_lexicon/Const_UTwente_HMI_KrantenTT_v1.0_KrantenTT.4gpr.kn.int/words.txt: No such file or directory
utils/build_const_arpa_lm.sh: <s> and </s> symbols are not in models/NL/UTwente/HMI/LM/KrantenTT/v1.0/LG_KrantenTT.3gpr.kn.int_UTwente_HMI_lexicon/Const_UTwente_HMI_KrantenTT_v1.0_KrantenTT.4gpr.kn.int/words.txt

Is this an indication that the whole process stopped?
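For anyone else reading along: a quick way to pull lines like these out of a long log is a small grep wrapper. This is only a sketch. The function name scan_configure_log is made up for illustration, and the three patterns are copied from the excerpt above, so they are not a complete list of what Kaldi_NL can report.

```shell
#!/usr/bin/env bash
# Sketch: list lines in a configure log that look like failures.
# scan_configure_log is a hypothetical helper, not part of Kaldi_NL.
# The patterns come from the log excerpt in this thread and are
# not an exhaustive list of possible error messages.

scan_configure_log() {
    local log="$1"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # -n prints line numbers, -E enables alternation with |
    # grep exits with 1 when nothing matches, and so does this function.
    grep -nE 'No such file or directory|expected .* to exist|symbols are not in' "$log"
}

# Usage (assumes the current directory is the Kaldi_NL checkout):
#   scan_configure_log configure.log || echo "no known failure patterns found"
```

An empty result only means none of these particular messages appear in the log. It does not prove the run succeeded.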

Michel-NL commented 7 years ago

Where did you find that log? Installing a fresh Red Hat server at the moment and going for a re-install.

Update: found the logs, I just needed to refresh.

wmelder commented 7 years ago

It is in the Kaldi_NL directory, the same place where configure.sh lives. My guess right now is that something went wrong while downloading the starterpack earlier, so I deleted my models directory and restarted configure.sh. It is now downloading the models again. I have detached from my screen session because it says the download will take an hour and a half. Will report back later.

Michel-NL commented 7 years ago

If I look at the configure.sh log files on the Ubuntu machine (my second server), there are no errors and everything looks OK to me. Laurens mentioned a few days ago that you should delete the models directory and let the configure script create the folder. Do not create it by hand, or there could be something wrong with the permissions on the folder.

wmelder commented 7 years ago

So the model directory should only be set in the dialog, not created beforehand with mkdir. Will try that. Thanks!
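A rough pre-flight check along those lines, in case it helps someone: the sketch below reports whether the model directory already exists before configure.sh is started, and shows its owner and permission bits if it does. The function name check_models_dir is hypothetical, and the path "models" is an assumption based on the paths in the log above; use whatever you entered in the dialog.

```shell
#!/usr/bin/env bash
# Sketch: report whether a model directory already exists before
# running configure.sh. check_models_dir is a hypothetical helper,
# not part of Kaldi_NL.
# Return codes: 0 = absent, 1 = exists and is a writable directory,
#               2 = exists but is not a writable directory.

check_models_dir() {
    local dir="$1"
    if [ ! -e "$dir" ]; then
        echo "absent: $dir does not exist yet"
        return 0
    fi
    # Show owner and permission bits so a mismatch is visible.
    ls -ld "$dir"
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "exists: $dir is a writable directory"
        return 1
    fi
    echo "exists: $dir is not a writable directory for $(id -un)"
    return 2
}

# Usage (the path "models" is an assumption, see above):
#   check_models_dir models
```

The function only reports and never deletes anything, so removing a leftover directory stays a deliberate manual step.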

wmelder commented 7 years ago

This time the decode.sh script is right!