acc8518 closed this issue 6 years ago
You should probably tokenize the text beforehand, for example with the Moses tokenizer: https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl . A line only has to end with a period to the extent that sentences naturally end with a period.
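To make "tokenized" concrete, here is a rough sketch of what the input could look like, one sentence per line with punctuation split off as separate tokens. As far as I know, lmplz treats each line as one sentence and splits on whitespace. The `tokenize_crude` function is a hypothetical stand-in written with `sed`, not the Moses tokenizer: it will also split things like decimals ("3.14") and abbreviations, so prefer a real tokenizer for actual training data.

```shell
# Crude stand-in tokenizer (an assumption for illustration, NOT tokenizer.perl):
# reads text on stdin, one sentence per line, and puts spaces around
# common punctuation so each mark becomes its own token.
tokenize_crude() {
    sed -e 's/\([.,!?;:]\)/ \1 /g' \
        -e 's/  */ /g' \
        -e 's/^ //' \
        -e 's/ $//'
}

# Example: prints "Hello , world ."
printf 'Hello, world.\n' | tokenize_crude

# The tokenized stream could then be fed to lmplz as in the original command
# (left commented out because it needs a KenLM build in bin/):
# tokenize_crude < text | bin/lmplz -o 5 > text.arpa
```

A line that does not end with punctuation passes through unchanged, which matches the point above: a final period is only there if the sentence naturally has one.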
Hello! Thank you for providing the tool.
I successfully ran the following command:
bin/lmplz -o 5 <text >text.arpa
I am just curious whether the input has to follow a particular format. For example, with one sentence per line, should each line end with a certain symbol such as '.'?
I am afraid that I did not follow the expected format and thus obtained a bad language model.