lyst / lightfm

A Python implementation of LightFM, a hybrid recommendation algorithm.
Apache License 2.0
4.73k stars 691 forks

movielens example: recall@k vs. precision@k vs. auc #544

Open deckikwok opened 4 years ago

deckikwok commented 4 years ago

Hi

For the movielens example,

1) Is it all right for Precision@K to be so much lower on the test set than on the train set, i.e. 0.11 on test vs. 0.6 on train? Is this simply a case of "it is better to have recommended than not"?

2) For business evaluation, and even for model comparison, is it better to normalize Precision@K by the maximum Precision@K that each user can attain?

3) Am I right to say that Precision@K for LightFM generally tends to suffer compared to the Surprise library? Since LightFM uses relative scoring, it always produces K recommendations regardless of the actual rating, whereas Surprise has the notion of a rating threshold and may recommend fewer than K items because of that minimum threshold requirement.

4) Is there any reason why recall@k is not computed in the example, even though LightFM has it implemented?
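For questions 1, 2 and 4, it may help to spell out the metrics. This is a plain-Python sketch (not LightFM's actual implementation) of precision@k, recall@k, and the per-user normalization proposed in question 2: dividing by the best precision@k a user could possibly achieve, which is `min(len(relevant), k) / k`. It shows why a user with only a couple of held-out test items caps out at a very low raw precision@k even with a perfect ranking.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / k

def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items recovered in the top k."""
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / len(relevant)

def normalized_precision_at_k(recommended, relevant, k):
    """Precision@k divided by the maximum precision@k this user can
    attain, which is min(len(relevant), k) / k."""
    max_attainable = min(len(relevant), k) / k
    if max_attainable == 0:
        return 0.0
    return precision_at_k(recommended, relevant, k) / max_attainable

# A user with only 2 held-out test items scores at most 0.2 at k=10,
# even when both items are ranked first:
recs = [5, 9, 1, 2, 3, 4, 6, 7, 8, 0]
test_items = [5, 9]
print(precision_at_k(recs, test_items, 10))             # 0.2
print(recall_at_k(recs, test_items, 10))                # 1.0
print(normalized_precision_at_k(recs, test_items, 10))  # 1.0
```

Under this view, a low raw test precision@k can coexist with a perfect normalized score, which is one argument for the normalization suggested in question 2.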

Apologies for the many questions.
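The mechanics behind question 3 can be illustrated with a toy example. The scores and the threshold below are made up; this is not Surprise's or LightFM's actual API, just a sketch of the difference between always emitting a fixed top-K list and filtering by a minimum predicted rating first.

```python
# Hypothetical (item, predicted_score) pairs, already sorted by score.
scored = [("a", 4.8), ("b", 4.1), ("c", 3.2), ("d", 2.5), ("e", 1.9)]

K = 4

# LightFM-style: relative scoring, so the top K items are always
# recommended regardless of their absolute predicted value.
top_k = [item for item, _ in scored[:K]]

# Surprise-style: only items above a minimum rating threshold are
# eligible, so the list can end up shorter than K.
threshold = 3.5
thresholded = [item for item, score in scored if score >= threshold][:K]

print(top_k)        # ['a', 'b', 'c', 'd']
print(thresholded)  # ['a', 'b'] -- fewer than K pass the threshold
```

With a fixed denominator of K, the top-K list pays a precision penalty for the low-scored items it is forced to include, which is the asymmetry question 3 points at.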

Sachet12345 commented 3 years ago

Hi, did you get the answers to your questions?

Chen-Cai-OSU commented 3 years ago

Hi,

I am also interested in understanding why matrix factorization performs poorly on the MovieLens data in terms of precision@k and recall@k. The benchmark notebook at https://github.com/microsoft/recommenders/blob/main/examples/06_benchmarks/movielens.ipynb shows another MF method, ALS, also doing poorly. Is it fair to conclude that most MF methods do not work well on MovieLens?