kpu / kenlm

KenLM: Faster and Smaller Language Model Queries
http://kheafield.com/code/kenlm/

Input Format #93

Closed by Doreenruirui 7 years ago

Doreenruirui commented 7 years ago

Should I put each sentence on its own line in the input to the model?

kpu commented 7 years ago

Yes. Here's a sentence splitter: https://github.com/moses-smt/mosesdecoder/blob/master/scripts/ems/support/split-sentences.perl
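For illustration, here is a minimal Python sketch of the "one sentence per line" layout. It is not the Moses script linked above: the regex splitter is a naive stand-in that breaks after `.`, `!`, or `?` followed by whitespace, so it will mis-split abbreviations such as "Dr." and should not be relied on for real corpora. The function names `split_sentences` and `to_training_lines` are hypothetical, chosen only for this example.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption, not the Moses algorithm):
    break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def to_training_lines(text):
    """Return the text with exactly one sentence per line."""
    # Collapse internal whitespace (including newlines) so that
    # no sentence spans more than one line.
    return "".join(" ".join(s.split()) + "\n" for s in split_sentences(text))


if __name__ == "__main__":
    raw = "KenLM reads plain text. Each sentence goes\non its own line! Is that all?"
    print(to_training_lines(raw), end="")
```

The output of `to_training_lines` can be written to a plain-text file and used as the model input; for anything beyond a quick test, a proper splitter such as the Moses script is likely the better choice.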

Doreenruirui commented 7 years ago

Thank you very much!