corticph / prefix-beam-search

Code for prefix beam search tutorial by @labodk
https://medium.com/corti-ai/ctc-networks-and-language-models-prefix-beam-search-explained-c11d1ee23306

Input formats #3

Open neonlight1203 opened 5 years ago

neonlight1203 commented 5 years ago

I have my corpus in plain text and a language model in .arpa format generated with KenLM. How can I feed those to the algorithm?

aayushkubb commented 5 years ago

Hi, I have the same question. Is there any way to use .arpa files?

shreyas1998 commented 4 years ago

Hi @bulls-i @aayushkubb, you can use arpa (https://github.com/sfischer13/python-arpa) to load the .arpa files in Python and look up n-gram probabilities much like in a dictionary.
Alternatively, you can read the .arpa files with plain Python file handling, without any libraries. See section 7.2 of the Python tutorial for details (https://docs.python.org/3/tutorial/inputoutput.html).
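
For the library-free route, here is a minimal sketch of what that file handling could look like. It is not code from this repo: the function name `parse_arpa`, the `SAMPLE` text, and the choice to store each entry as `(log10_prob, log10_backoff)` are all assumptions made for illustration. It relies only on the standard ARPA layout, where each `\N-grams:` section lists a log10 probability, the N words, and an optional backoff weight.

```python
import re

def parse_arpa(lines):
    """Parse ARPA-format lines into {ngram_tuple: (log10_prob, log10_backoff)}.

    Hypothetical helper for illustration; not part of this repo.
    """
    ngrams = {}
    order = 0  # 0 means we are not inside an n-gram section
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        header = re.fullmatch(r"\\(\d+)-grams:", line)
        if header:
            order = int(header.group(1))  # e.g. "\2-grams:" -> 2
            continue
        if line.startswith("\\"):  # "\data\" or "\end\"
            order = 0
            continue
        if order == 0:
            continue  # skip the "ngram N=count" lines in the header
        fields = line.split()
        log_prob = float(fields[0])
        words = tuple(fields[1:1 + order])
        # The backoff weight is optional; treat a missing one as 0.0 (log10 of 1).
        backoff = float(fields[1 + order]) if len(fields) > 1 + order else 0.0
        ngrams[words] = (log_prob, backoff)
    return ngrams

# Tiny made-up model, only to show the expected file layout.
SAMPLE = r"""
\data\
ngram 1=3
ngram 2=2

\1-grams:
-1.0 <s> -0.5
-0.7 hello -0.3
-0.9 </s>

\2-grams:
-0.2 <s> hello
-0.4 hello </s>

\end\
"""

table = parse_arpa(SAMPLE.splitlines())
```

For a real model you would pass an open file instead of the sample, e.g. `with open("lm.arpa", encoding="utf-8") as f: table = parse_arpa(f)` (the path is a placeholder). The stored values are log10 probabilities, so `10 ** table[("hello",)][0]` gives the linear probability. Note that this sketch only looks up n-grams that appear in the file; it does not apply backoff for unseen n-grams.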