haifengl / smile

Statistical Machine Intelligence & Learning Engine
https://haifengl.github.io

Avoid negative BM25 scores #683

Closed (schneijan closed this 3 years ago)

schneijan commented 3 years ago

Description

This PR implements the suggestions from #682 by adjusting the IDF formula in three places in the BM25.java class. Because the new formula yields slightly different scores, the expected value in the corresponding unit test changed slightly, so I adjusted it as well.
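
For readers without the diff at hand, here is a minimal sketch of why a BM25 IDF term can go negative and how one common variant avoids it. The exact formula adopted in this PR is not shown on this page, so everything below is an assumption for illustration: the class `IdfSketch` and its method names are hypothetical and are not taken from BM25.java. The classic Robertson/Sparck Jones IDF, log((N - n + 0.5) / (n + 0.5)), is negative whenever a term occurs in more than half of the documents; adding 1 inside the logarithm (the variant used, for example, by Lucene's BM25Similarity) keeps it positive.

```java
// Hypothetical illustration only; this is NOT the code from BM25.java.
class IdfSketch {
    /**
     * Classic Robertson/Sparck Jones IDF.
     * N = number of documents in the corpus, n = number of documents containing the term.
     * The result is negative whenever n > N / 2.
     */
    static double classicIdf(long N, long n) {
        return Math.log((N - n + 0.5) / (n + 0.5));
    }

    /**
     * Variant with 1 added inside the logarithm.
     * For 0 <= n <= N the argument is greater than 1, so the result is always positive.
     */
    static double nonNegativeIdf(long N, long n) {
        return Math.log(1.0 + (N - n + 0.5) / (n + 0.5));
    }

    public static void main(String[] args) {
        long N = 10; // corpus size (made-up example value)
        long n = 8;  // documents containing the term (made-up example value)
        System.out.println("classic IDF      = " + classicIdf(N, n));      // negative
        System.out.println("non-negative IDF = " + nonNegativeIdf(N, n));  // positive
    }
}
```

Since both variants fall as n grows, the second one still ranks rare terms above common ones. Only the absolute scores shift, which would be consistent with a unit test's expected value changing slightly.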

Please tell me if something is wrong with this PR or if you do not see a need for this formula change. Thanks in advance!

haifengl commented 3 years ago

Thanks!