benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets
https://benfred.github.io/implicit/
MIT License

.recommend returns different output when a different 'N' parameter is specified for the bm25 recommender #620

Status: Closed (saranggupta closed this issue 1 year ago)

saranggupta commented 2 years ago

I trained a bm25 nearest neighbor recommender. When I pass different values of the 'N' parameter to model.recommend(), I get different scores and a different ordering of the recommendations. With the ALS model, the recommendations stay the same across different values of N. Why would the bm25 nearest neighbor recommender return different ranks and scores for different values of N?

saranggupta commented 2 years ago

Also, what do the scores mean for bm25 when calculating similar items? The scores seem to be unbounded. My understanding is that they should be cosine distances, which should be bounded, but the values are in the thousands.
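To illustrate why such scores need not be bounded, here is a minimal sketch in plain Python. It assumes the similarity is an unnormalized dot product of BM25-weighted vectors, which is not confirmed anywhere in this thread. The weight formula is the generic BM25 term weight, and the counts, `idf` values, and helper names are all made up for illustration; this is not the library's code.

```python
import math


def bm25_term_weight(tf, idf, k1=100.0, length_norm=1.0):
    """Generic BM25 term weight (assumed form, not taken from implicit)."""
    return idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)


def dot(u, v):
    """Unnormalized dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity: the dot product divided by both vector norms."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


# Hypothetical interaction counts of two items over four users.
raw_a = [5, 3, 0, 8]
raw_b = [4, 0, 2, 9]
# Hypothetical per-user inverse document frequencies.
idf = [2.0, 3.5, 3.0, 1.5]

weighted_a = [bm25_term_weight(tf, w) for tf, w in zip(raw_a, idf)]
weighted_b = [bm25_term_weight(tf, w) for tf, w in zip(raw_b, idf)]

dot_score = dot(weighted_a, weighted_b)        # grows with the weights
cosine_score = cosine(weighted_a, weighted_b)  # stays within [0, 1] here

print(f"dot product: {dot_score:.2f}")
print(f"cosine:      {cosine_score:.4f}")
```

Under that assumption, only the cosine variant is bounded, because it divides by the vector norms. The raw dot product scales with the size of the weights, so large values are not by themselves a sign of a bug.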

benfred commented 1 year ago

@saranggupta - I think the different output might have been caused by this bug (https://github.com/benfred/implicit/issues/627). Can you try out the fix, either by installing one of the wheels from the artifacts here (https://github.com/benfred/implicit/actions/runs/3517226535#artifacts) or by building from the main branch?

benfred commented 1 year ago

The fix is in the latest release. Let me know if this is still a problem.
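One way to check whether the problem persists after upgrading is to test that the top-N list for a small N is a prefix of the list for a larger N. The sketch below uses a toy recommender over a made-up score table; `top_n`, `is_prefix_consistent`, and `toy_scores` are hypothetical names, not part of implicit. With a real model, the `recommend` callable would wrap `model.recommend(...)` and convert its output to a list of (item, score) pairs; the exact return shape depends on the installed version, so treat that part as an assumption.

```python
def top_n(scores, n):
    """Return the n highest-scoring (item, score) pairs, ties broken by item id."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


def is_prefix_consistent(recommend, n_small, n_large):
    """True if the n_small results match the first n_small of the n_large results."""
    small = recommend(n_small)
    large = recommend(n_large)
    return small == large[:len(small)]


# Made-up scores standing in for a trained model.
toy_scores = {"a": 3.0, "b": 7.5, "c": 1.2, "d": 7.5, "e": 0.4}


def recommend(n):
    # Stand-in for a wrapper around a real model's recommend call.
    return top_n(toy_scores, n)


consistent = is_prefix_consistent(recommend, 2, 4)
print("top-2 is a prefix of top-4:", consistent)
```

A deterministic ranking should pass this check for any pair of N values. If scores are floating point and computed along different code paths, comparing with a small tolerance instead of exact equality may be more appropriate.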