yandexdataschool / nlp_course

YSDA course in Natural Language Processing
https://lena-voita.github.io/nlp_course.html
MIT License
9.83k stars, 2.61k forks

Additional assert for count_ngrams #156

Open tempoden opened 1 month ago

tempoden commented 1 month ago

I spent quite some time trying to find an error in my function that computes perplexity for the case n=1: I got 321 instead of the desired 318.

It turned out that the mistake was in count_ngrams: for n=1 I had counted EOS twice. To help others catch a similar mistake earlier and spend less time on it than I did, I propose adding one more assert for this corner case.
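For illustration, here is a minimal sketch of the kind of check meant here. It is not the exact assert from this pull request. It assumes that `count_ngrams(lines, n)` returns a mapping from prefix tuples to `Counter`s of next tokens, that lines are space-separated tokens, and that the special tokens are named `_UNK_` and `_EOS_`. The reference implementation and `dummy_lines` are hypothetical stand-ins.

```python
from collections import Counter, defaultdict

# Assumed special tokens; the names are illustrative.
UNK, EOS = "_UNK_", "_EOS_"


def count_ngrams(lines, n):
    """Count n-grams as {prefix tuple: Counter of next tokens}.

    Each line is padded with n-1 UNK tokens on the left and gets
    exactly one EOS appended on the right.
    """
    counts = defaultdict(Counter)
    for line in lines:
        tokens = line.split() + [EOS]        # EOS is added once per line
        padded = [UNK] * (n - 1) + tokens
        for i in range(len(tokens)):
            prefix = tuple(padded[i:i + n - 1])   # empty tuple when n == 1
            next_token = padded[i + n - 1]
            counts[prefix][next_token] += 1
    return counts


# Hypothetical toy corpus for the corner-case check.
dummy_lines = ["a b c", "a b"]
unigram_counts = count_ngrams(dummy_lines, n=1)

# For n=1 the only prefix is the empty tuple, and EOS must be
# counted once per line, not twice.
assert set(unigram_counts.keys()) == {()}
assert unigram_counts[()][EOS] == len(dummy_lines)

# Total unigram count equals the number of tokens plus one EOS per line.
expected_total = sum(len(line.split()) + 1 for line in dummy_lines)
assert sum(unigram_counts[()].values()) == expected_total
```

An implementation that appends EOS in two places would fail the second assert right away, before the discrepancy surfaces as a slightly wrong perplexity value.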
