(Terribly sorry for the delay.) The N+1 term stands for the number of distinct successor words with nonzero counts, as depicted in the first formula on the slide. Here's the full slide:
In other words, part 2 is exactly |{w' : c(w_{i-1}, w') > 0}|, just like the wiki formula suggests.
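To make the distinction concrete, here is a minimal sketch for the bigram case, assuming the wiki form lambda = d * |{w' : c(w_{i-1}, w') > 0}| / sum_{w'} c(w_{i-1}, w'). The function name `kn_lambda`, the toy corpus, and the discount value 0.75 are all made up for illustration and are not taken from the slide. The point is that the total count adds up every occurrence of the context, while the term from part 2 only counts how many different words ever follow it.

```python
from collections import Counter


def kn_lambda(tokens, context, discount=0.75):
    """Return (total, distinct, lambda) for a one-word context.

    Hypothetical helper for illustration only; discount=0.75 is an
    assumed value, not one given in the thread.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Counts c(context, w') for every word w' seen right after `context`.
    successors = {w2: c for (w1, w2), c in bigrams.items() if w1 == context}
    total = sum(successors.values())  # sum over w' of c(context, w')
    distinct = len(successors)        # |{w' : c(context, w') > 0}|
    if total == 0:
        raise ValueError(f"context {context!r} is never followed by a word")
    return total, distinct, discount * distinct / total


tokens = "the cat sat on the mat the cat ran".split()
total, distinct, lam = kn_lambda(tokens, "the")
print(total, distinct, lam)
```

Here "the" is followed by "cat" twice and "mat" once, so the total count is 3 but the number of distinct successors is only 2. The two quantities coincide only when every successor occurs exactly once.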
Feel free to reopen if you have further questions... if you still remember this exists
Hi! I have a question about the Kneser-Ney smoothing formula. The formula for calculating lambda on the slide is shown below.
What is the difference between part 1 and part 2? I think they both count the number of times w_{i-n+1}, ..., w_{i-1} and w_i co-occur. And the formula on the wiki page is: