renepickhardt / generalized-language-modeling-toolkit

Generalized Language Modeling toolkit
http://glm.rene-pickhardt.de

Standard Formats #11

Open renepickhardt opened 10 years ago

renepickhardt commented 10 years ago

The output of language models and n-grams should follow standard formats, e.g. the weighted finite-state transducer (WFST) format and the ARPA format.

Currently I am not sure whether more formats exist. They should be researched and implemented.
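For reference, a minimal sketch of what ARPA output could look like for a plain back-off n-gram model. It is written in Python only for brevity and is not part of the toolkit; the function name `format_arpa` and the toy probabilities are made up for illustration. An ARPA file starts with a `\data\` header that lists the n-gram counts per order, followed by one section per order in which each line holds the log10 probability, the n-gram, and an optional log10 back-off weight, and it ends with `\end\`.

```python
import math
from collections import defaultdict


def format_arpa(ngrams):
    """Render a back-off model as ARPA text.

    ngrams maps a tuple of tokens to (probability, backoff_weight or None).
    This input layout is a hypothetical choice made for this sketch.
    """
    by_order = defaultdict(list)
    for tokens, (prob, backoff) in ngrams.items():
        by_order[len(tokens)].append((tokens, prob, backoff))

    # Header: number of n-grams stored for each order.
    lines = ["\\data\\"]
    for order in sorted(by_order):
        lines.append(f"ngram {order}={len(by_order[order])}")
    lines.append("")

    # One section per order: log10(prob) <tab> n-gram [<tab> log10(backoff)].
    for order in sorted(by_order):
        lines.append(f"\\{order}-grams:")
        for tokens, prob, backoff in sorted(by_order[order], key=lambda e: e[0]):
            fields = [f"{math.log10(prob):.6f}", " ".join(tokens)]
            if backoff is not None:
                fields.append(f"{math.log10(backoff):.6f}")
            lines.append("\t".join(fields))
        lines.append("")

    lines.append("\\end\\")
    return "\n".join(lines) + "\n"


# Toy model with invented numbers, only to show the layout.
toy_model = {
    ("a",): (0.5, 0.1),
    ("b",): (0.5, None),
    ("a", "b"): (1.0, None),
}
print(format_arpa(toy_model))
```

The sketch assumes every n-gram has a single probability and at most one back-off weight, which is the shape the ARPA layout expects.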

lschmelzeisen commented 9 years ago

In our experiments we found that we won't be able to conform to standard formats (namely ARPA). I guess we won't have this for the stable release then?

renepickhardt commented 9 years ago

Yes, our current tool will not provide an ARPA file, since a good decoder is shipped. I would still leave this bug open (maybe move it to a different milestone), since saving our model as an FST might still be an option, also for saving space and improving runtime.
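To make the FST idea a bit more concrete, here is a rough sketch of how a plain bigram model could be written out as a weighted automaton in the AT&T-style text layout that tools such as OpenFst read. It is Python for brevity and not taken from the toolkit. Each arc line is `source destination input-label output-label weight`, a final state is `state weight`, and the source state of the first line is treated as the start state. The function name `bigram_to_fst_text`, the `<s>` and `</s>` markers, and the toy probabilities are assumptions made for illustration. Weights are negative natural-log probabilities, and back-off arcs are left out.

```python
import math


def bigram_to_fst_text(bigrams, start="<s>", end="</s>"):
    """Return AT&T-style text lines for a bigram model.

    bigrams maps (previous_word, word) to a probability. Each state stands
    for a one-word history. Hypothetical helper, not part of the toolkit.
    """
    state_ids = {start: 0}  # give the start history id 0

    def state_of(history):
        return state_ids.setdefault(history, len(state_ids))

    arcs, finals = [], []
    for (prev, word), prob in sorted(bigrams.items()):
        weight = -math.log(prob) + 0.0  # "+ 0.0" turns -0.0 into 0.0
        if word == end:
            # Ending the sentence makes the history state final.
            finals.append((state_of(prev), weight))
        else:
            arcs.append((state_of(prev), state_of(word), word, weight))

    # The first line must leave the start state, so order arcs by source id.
    arcs.sort(key=lambda arc: arc[0])
    lines = [f"{s}\t{d}\t{w}\t{w}\t{wt:.4f}" for s, d, w, wt in arcs]
    lines += [f"{s}\t{wt:.4f}" for s, wt in finals]
    return lines


# Toy model with invented numbers.
toy_bigrams = {
    ("<s>", "a"): 0.5,
    ("<s>", "b"): 0.5,
    ("a", "</s>"): 1.0,
    ("b", "a"): 1.0,
}
print("\n".join(bigram_to_fst_text(toy_bigrams)))
```

A real export would also need symbol tables that map the word labels to integers and epsilon arcs for back-off. Whether the generalized models can be expressed this way at all is exactly the open question in this issue.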