renepickhardt / generalized-language-modeling-toolkit

Generalized Language Modeling toolkit
http://glm.rene-pickhardt.de

Implement the novel modified Kneser-Ney smoother #27

Closed: renepickhardt closed 10 years ago

renepickhardt commented 10 years ago

I suspect that the existing modified Kneser-Ney class is not as efficient as it could be (this needs profiling).

In any case it cannot work with our novel sequences out of the box.

So we either need to integrate the new smoothing techniques into the existing class or, more likely, reimplement Kneser-Ney smoothing for our novel approach.

The novel Kneser-Ney class should be able to perform various smoothing steps (potentially at the same time?).

Results should be logged to different files, named according to the version of the smoother (e.g. mod-kneser-ney-posglm-5.txt or mod-kneserney-originalglm-5.txt).
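One way to derive such file names is sketched below in Python, purely as an illustration. The function `result_file_name` and its parameters are hypothetical and not part of the toolkit. The two example names above hyphenate the smoother name differently; this sketch assumes the first form (`mod-kneser-ney-...`) and reads the trailing number as the model order, which is also an assumption.

```python
import re


def result_file_name(smoother: str, variant: str, order: int) -> str:
    """Build a result file name such as 'mod-kneser-ney-posglm-5.txt'.

    Hypothetical helper: smoother is the smoother name, variant the
    sequence type (e.g. 'posglm'), order the assumed model order.
    """
    if order < 1:
        raise ValueError("model order must be at least 1")
    # Lowercase each part and collapse anything that is not a letter
    # or digit into a single hyphen.
    parts = [
        re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")
        for part in (smoother, variant)
    ]
    return "-".join(parts + [str(order)]) + ".txt"
```

Keeping the naming in one place would mean every smoother version writes to a predictable, distinct file.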

Estimates of the probabilities should always be stored as log-probabilities.
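A minimal Python sketch of why log-probabilities are the safer representation: multiplying many small probabilities underflows to zero in floating point, while summing their logarithms does not. The function names are hypothetical, and the natural logarithm is an assumption, since the issue does not specify a base.

```python
import math
from typing import Iterable


def to_log_prob(p: float) -> float:
    """Convert a probability to its natural logarithm (base is an assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # log(0) is undefined, so a zero probability maps to negative infinity.
    return math.log(p) if p > 0.0 else float("-inf")


def sequence_log_prob(probs: Iterable[float]) -> float:
    """Log-probability of a sequence: the sum of the per-token logs."""
    return sum(to_log_prob(p) for p in probs)
```

For example, the plain product of one hundred probabilities of 1e-5 underflows to 0.0, whereas `sequence_log_prob([1e-5] * 100)` stays finite.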

lschmelzeisen commented 10 years ago

We will create a whole hierarchy of smoothers. Separate issues exist for each of them; see the milestone "Implement Smoothers".