ammyt opened 4 years ago
The main tool we've been using to manually add dictionary entries is speech_lex_edit.py, which is mentioned here:
Actually, after closer inspection, I can't seem to run any of the Python scripts. I keep getting a syntax error at {var_exp:2.2} in nltools/plotting.py. I can't figure out why, and I'm really sorry if this is a beginner's question.
I managed to solve the issue by building and installing py-nltools; I had been using the wrong nltools version. Now I can finally use the scripts.
I have been trying to add domain-specific (medical) words to the German model, but unfortunately I couldn't find an explanation of how to do that. I tried following: https://chrisearch.wordpress.com/2017/03/11/speech-recognition-using-kaldi-extending-and-using-the-aspire-model/
I was able to create the merged lexicon.txt and lm.arpa, but I couldn't complete the process because the nonsilence phones differ:
--> ERROR: phone "c" is not in {, non}silence.txt (line 38)
--> ERROR: phone "au" is not in {, non}silence.txt (line 44)
--> ERROR: phone "au" is not in {, non}silence.txt (line 45)
...
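For anyone hitting the same errors, here is a minimal sketch of a check that can be run before the validation step. It lists every phone used in a merged lexicon that is missing from the phone lists, together with the lexicon lines that use it. It assumes the plain Kaldi lexicon.txt layout (a word followed by its phones, whitespace-separated, with no probability column) and phone files that list one or more phones per line. The function names and the sample data are hypothetical, not part of Kaldi or of the scripts discussed here.

```python
from collections import defaultdict


def load_phone_set(lines):
    """Collect every phone from phone-list lines (one or more phones per line)."""
    phones = set()
    for line in lines:
        phones.update(line.split())
    return phones


def find_unknown_phones(lexicon_lines, known_phones):
    """Map each phone missing from known_phones to the 1-based lexicon lines using it.

    Assumes each lexicon line is "<word> <phone> <phone> ...".
    """
    unknown = defaultdict(list)
    for lineno, line in enumerate(lexicon_lines, start=1):
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and entries without a pronunciation
        for phone in fields[1:]:
            if phone not in known_phones:
                unknown[phone].append(lineno)
    return dict(unknown)


if __name__ == "__main__":
    # Hypothetical sample data; in practice, read the lines from
    # silence_phones.txt, nonsilence_phones.txt and the merged lexicon.txt.
    known = load_phone_set(["SIL", "a", "b", "t"])
    lexicon = ["bat b a t", "cau c au", "tab t a b"]
    for phone, linenos in sorted(find_unknown_phones(lexicon, known).items()):
        print(phone, linenos)
```

Any phone this reports either has to be mapped onto a phone the model already knows or the entry has to be rewritten, since the check only locates the mismatches and does not resolve them.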
I then tried using speech_kaldi_adapt.py from here, but I get
Traceback (most recent call last):
  File "speech_kaldi_adapt.py", line 35, in <module>
    from nltools import misc
  File "build/bdist.linux-x86_64/egg/nltools/__init__.py", line 16, in <module>
  File "build/bdist.linux-x86_64/egg/nltools/analysis.py", line 16, in <module>
  File "/usr/local/lib/python2.7/dist-packages/nltools-0.3.20-py2.7.egg/nltools/plotting.py", line 797
    ax[0].set_title(f"Component: {component}/{len(output['components'])}, Variance Explained: {var_exp:2.2}", fontsize=18)
                                                                                                           ^
SyntaxError: invalid syntax
I have been looking for a long time for a way to add new words to one of the existing models, but unfortunately I cannot find enough documentation. My goal is to add some new words and then adapt the grammar (which I have already done successfully). I would really appreciate pointers in the right direction!