kpu / kenlm

KenLM: Faster and Smaller Language Model Queries
http://kheafield.com/code/kenlm/

How to reduce the influence of word frequency #437

Open antct opened 1 year ago

antct commented 1 year ago

Hi, thanks for your great work. Recently I noticed that the final score of high-frequency words is much higher than that of low-frequency words. After reading other issues, I understand that KenLM already uses smoothing to alleviate this problem. My question is how to reduce the effect further, for example by increasing the strength of the smoothing?
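
To make the question concrete, here is a minimal sketch of the effect on a toy unigram model. It uses plain additive (add-k) smoothing, which is not KenLM's method (KenLM estimates models with modified Kneser-Ney smoothing). The corpus, the function name `add_k_log10_probs`, and the values of `k` are all made up for illustration.

```python
import math
from collections import Counter


def add_k_log10_probs(tokens, k):
    """Return log10 unigram probabilities under add-k (additive) smoothing.

    Illustration only: this is NOT how KenLM smooths its models.
    """
    counts = Counter(tokens)
    vocab_size = len(counts)
    total = len(tokens)
    return {
        word: math.log10((count + k) / (total + k * vocab_size))
        for word, count in counts.items()
    }


# Hypothetical toy corpus: "the" is frequent, "zebra" is rare.
corpus = ["the"] * 90 + ["cat"] * 9 + ["zebra"]

for k in (0.0, 1.0, 10.0, 100.0):
    scores = add_k_log10_probs(corpus, k)
    # Difference in log10 score between the frequent and the rare word.
    gap = scores["the"] - scores["zebra"]
    print(f"k={k:>5}: the={scores['the']:.3f} zebra={scores['zebra']:.3f} gap={gap:.3f}")
```

In this toy setting, a larger `k` moves probability mass from the frequent word to the rare one, so the score gap shrinks. That is the kind of behaviour I am hoping to get from a stronger smoothing setting.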