kpu / kenlm

KenLM: Faster and Smaller Language Model Queries
http://kheafield.com/code/kenlm/

Accessing n-gram frequencies from a large file (almost 10GB) #131

Closed: leilaafsar closed this issue 6 years ago

leilaafsar commented 6 years ago

I'm quite new to natural language processing. I have a very large text file (almost 10GB) and need to extract its n-grams (1- through 5-grams) together with their frequencies. Is it possible to get these n-grams with KenLM? If so, I need a script for it.

kpu commented 6 years ago

Yes. You're welcome to read https://kheafield.com/code/kenlm/, which includes command lines.
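
For readers who only want to see what "n-gram by frequency" means in practice, here is a minimal sketch in plain Python. It is not KenLM and does not use any KenLM API; the names `ngrams` and `count_ngrams` are hypothetical helpers made up for this illustration. It assumes whitespace-tokenized text with one sentence per line, and it keeps every count in memory, so on a file of almost 10GB it would likely need far more RAM than a typical machine has. Treat it as a way to check expectations on a small sample, not as a replacement for the command lines on the KenLM page.

```python
import sys
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def ngrams(tokens: List[str], n: int) -> Iterator[Tuple[str, ...]]:
    """Yield every contiguous run of n tokens (hypothetical helper)."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def count_ngrams(lines: Iterable[str], max_order: int = 5) -> Dict[int, Counter]:
    """Count 1..max_order-grams, treating each line as one sentence.

    Assumption: tokens are separated by whitespace and n-grams do not
    cross line boundaries. All counts are held in memory.
    """
    counts: Dict[int, Counter] = {n: Counter() for n in range(1, max_order + 1)}
    for line in lines:
        tokens = line.split()
        for n in range(1, max_order + 1):
            counts[n].update(ngrams(tokens, n))
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Stream the file line by line; only the counters grow in memory.
    with open(sys.argv[1], encoding="utf-8") as handle:
        result = count_ngrams(handle)
    # Print the ten most frequent n-grams of each order.
    for order, counter in result.items():
        for gram, freq in counter.most_common(10):
            print(order, " ".join(gram), freq)
```

Saved under a name of your choosing, the script takes the text file as its only argument and prints the ten most frequent n-grams for each order from 1 to 5. Note that an ARPA file stores log probabilities and backoff weights rather than raw counts, so frequencies and a language model are related but different outputs.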