Open ShellingFord221 opened 4 years ago
Also, why do you use maximum likelihood estimation? I think the easiest way is just to count the number of examples of each class in the held-out clean dev set, since the class distribution in the 20% sample should be representative of the distribution in the whole test set. Thanks!
Hello @ShellingFord221. Thank you for your questions!
Around these lines we load a clean dev set (`cdev_dset`) and store its distribution in the variable `cdev_lp`.
For a discrete (categorical) random variable, the maximum likelihood estimate of the distribution is simply the relative frequency of each class. That is what we compute here, so it is equivalent to the counting method you mentioned.
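To illustrate the equivalence, here is a minimal sketch, not the repository's actual code. Maximizing the log-likelihood sum_c n_c log p_c subject to sum_c p_c = 1 gives p_c = n_c / N, which is exactly the normalized class count. The function names, the toy labels, and the conversion to log-probabilities (suggested only by the `_lp` suffix of `cdev_lp`) are all assumptions made for illustration.

```python
import math
from collections import Counter


def estimate_label_distribution(labels, num_classes):
    """Maximum likelihood estimate of a categorical distribution.

    For categorical data the MLE is the relative frequency of each
    class, i.e. count(class) / total.
    """
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]


def to_log_probs(probs, eps=1e-12):
    """Convert probabilities to log-probabilities.

    eps is a hypothetical floor that avoids log(0) for classes
    that never appear in the dev set.
    """
    return [math.log(max(p, eps)) for p in probs]


# Toy labels standing in for a held-out clean dev set (made-up data).
dev_labels = [0, 0, 1, 2, 0, 1, 0, 2, 0, 0]

probs = estimate_label_distribution(dev_labels, num_classes=3)
log_probs = to_log_probs(probs)
```

With these toy labels, class 0 appears 6 times out of 10 and classes 1 and 2 appear twice each, so the estimated distribution is just those counts divided by 10.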
I hope this answers your question :-)
Hi, in your paper, for the estimation of the test distribution p(r|Dm), you use "maximum likelihood estimation on a held-out clean dev set, which is a 20% sample from test set". Could you point me to where this is done in your code? Thanks!