flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

Error during lexicon load #700

Open AlexandderGorodetski opened 4 years ago

AlexandderGorodetski commented 4 years ago

Hello,

I got the following error:

I0617 13:56:40.737172 104 Decode.cpp:247] [Decoder] LM constructed.
terminate called after throwing an instance of 'std::invalid_argument'
  what(): Unknown entry in dictionary: '360'

I guess it occurred because I have words like bold360 in my dictionary.

Should I preprocess my dictionary and my text so that "bold360" is replaced with "bold three sixty"?

Do you have a preprocessing tool that does this?

Thanks, Alexander.

tlikhomanenko commented 4 years ago

@AlexandderGorodetski

Do you have the word "bold360" in your lexicon file? If so, what token sequence did you set for it?
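One way to answer that question is to scan the lexicon for spellings that use a token outside the token set, since the error message names '360' as the unknown entry. The sketch below is only an illustration: `find_unknown_tokens` is a hypothetical helper, not part of wav2letter, and it assumes each lexicon line is a word followed by its whitespace-separated tokens.

```python
def find_unknown_tokens(lexicon_lines, known_tokens):
    """Return {word: [unknown tokens]} for entries whose spelling uses tokens
    that are not in known_tokens.

    Assumption: each lexicon line is "<word> <token> <token> ...".
    """
    problems = {}
    for line in lexicon_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word, spelling = parts[0], parts[1:]
        unknown = [tok for tok in spelling if tok not in known_tokens]
        if unknown:
            problems.setdefault(word, []).extend(unknown)
    return problems


# Hypothetical token set: lowercase letters plus a word-boundary marker.
tokens = set("abcdefghijklmnopqrstuvwxyz") | {"|"}
lexicon = ["bold b o l d |", "bold360 b o l d 360 |"]
print(find_unknown_tokens(lexicon, tokens))
```

Any word reported here needs either a spelling built only from known tokens or a text normalization step that removes the digits.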

In practice it is better to convert numbers to words before AM (acoustic model) training. I use this Python library for that: https://pypi.org/project/num2words/. It should be easy to use.
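A minimal sketch of such a preprocessing step, assuming digit runs should be split off from the surrounding letters and spelled out. `spell_out_numbers` and its `to_words` parameter are hypothetical names for illustration; only `num2words` itself comes from the library linked above.

```python
import re

# Matches each run of consecutive digits, e.g. the "360" in "bold360".
_DIGIT_RUN = re.compile(r"\d+")


def spell_out_numbers(text, to_words=None):
    """Replace every digit run in `text` with its spelled-out form.

    `to_words` is a callable taking an int and returning a string.
    If omitted, num2words is used (requires `pip install num2words`).
    """
    if to_words is None:
        # Imported lazily so a custom converter works without the package.
        from num2words import num2words
        to_words = num2words

    def _replace(match):
        # Pad with spaces so "bold360" becomes "bold <words>", not "bold<words>".
        return " " + to_words(int(match.group())) + " "

    spaced = _DIGIT_RUN.sub(_replace, text)
    # Collapse the extra whitespace introduced by the padding.
    return " ".join(spaced.split())


# Usage with a custom converter for a brand-specific reading (hypothetical):
print(spell_out_numbers("bold360", to_words=lambda n: "three sixty"))
```

With the default converter, 360 is read as a cardinal number rather than as "three sixty", so brand names like this one may need a custom mapping. The spelled-out text may also contain hyphens or commas, which would have to be stripped if the token set does not include them. The same normalization should be applied to both the lexicon and the transcripts so they stay consistent.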