Closed petrov826 closed 2 years ago
A super kind Kaggle user told me that the MAP of RecBole and that of the competition are different. I'll define a custom MAP and check the gap again.
If you have other information, please let me know.
I checked the formula of MAP in the competition, as shown below. Could you explain the difference between `n` and `12`?
Thank you for your reply @guijiql!
For most cases, `n` is 12. In this competition, we are asked to recommend the top 12 items per customer. We CAN recommend only the top 3 items if we want, but we would lose the chance of getting a higher score, so no one does that.
This is the official comment from competition organizer.
There is never a penalty for using the full 12 predictions for a customer that ordered fewer than 12 items; thus, it's advantageous to make 12 predictions for each customer.
Thanks for your explanation. If `n` is 12, the formula for MAP in the competition is exactly the same as the calculation in RecBole. There is a typo in the documentation, i.e. `min(|\hat R(u)|, K)` should be `min(|R(u)|, K)`. I don't think your problem is caused by a difference in the MAP formula.
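To make the equivalence concrete, here is a minimal, self-contained sketch of MAP@K as the competition defines it (the names `apk`/`mapk` are illustrative helpers, not part of RecBole's API):

```python
def apk(actual, predicted, k=12):
    """Average precision at k:
    AP@k = (1 / min(|R(u)|, k)) * sum over hit positions i of (hits so far / i).
    `actual` is the set of relevant items, `predicted` the ranked recommendations."""
    if not actual:
        return 0.0
    predicted = predicted[:k]
    hits, score, seen = 0, 0.0, set()
    for i, p in enumerate(predicted, start=1):
        if p in actual and p not in seen:  # count each relevant item once
            seen.add(p)
            hits += 1
            score += hits / i              # precision at this hit position
    return score / min(len(actual), k)     # note: |R(u)|, not |\hat R(u)|

def mapk(actuals, predictions, k=12):
    """Mean of AP@k over all users."""
    return sum(apk(a, p, k) for a, p in zip(actuals, predictions)) / len(actuals)
```

With `k=12` this matches the competition metric; the only subtlety is the denominator `min(|R(u)|, k)`, which is why padding every customer out to 12 predictions can never hurt.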
Thank you too @guijiql
No difference between the MAP formulas is great news! With MAP@k calculated correctly, my model is good enough for this competition!
Now I'm wondering whether my usage of `full_sort_topk()` might be wrong. Is there a way to get the top-k items using `model.predict()` or something similar? If I use the same prediction functions that are used to evaluate my model, the gap might disappear.
The competition ended and I got almost the same score as the LB one. It seems that both the RecBole score and Kaggle's LB score are correct.
I still don't know why there was such a huge gap between them, but I guess it comes from domain shift or something similar, because fashion trends change rapidly all the time.
Anyway, we should investigate this issue further using a more widely used dataset like MovieLens.
**Describe the bug**
I got MAP@12 = 0.148 on the eval set by running `trainer.evaluate(test_data)`, but the LB score was only 0.0124.

**To Reproduce**
submission.csv
**Expected behavior**
I'd get a much better score. The MAP@12 is 0.148. My model learned 450,255 users' interactions, and there are 1,371,980 users in the submission file. If I'm correct, the LB score should be about 0.0485 (= 0.148 * 450255 / 1371980).
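That back-of-the-envelope estimate can be checked directly. It assumes the 921,725 users outside the training set contribute a score of zero, so the covered users' MAP is simply diluted by the coverage ratio:

```python
eval_map = 0.148            # MAP@12 reported by trainer.evaluate(test_data)
covered_users = 450_255     # users whose interactions the model learned
total_users = 1_371_980     # users required in the submission file

# If uncovered users score 0, the overall LB estimate is the
# covered-user MAP weighted by the fraction of covered users.
expected_lb = eval_map * covered_users / total_users
print(f"{expected_lb:.4f}")
```

This prints roughly 0.0486, consistent with the "about 0.0485" figure in the report, and still far above the observed 0.0124.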
**My guess**
This gap may be coming from the following:
- `full_sort_topk`
**Additional information**
I've already opened a discussion on Kaggle. Here's my original post.
If "details" tags are not recommended here, please let me know. I'll fix it.
It's long. Please click to expand
Hello everyone. I made a [notebook](https://www.kaggle.com/code/peterpetrov826/fork-of-using-recbole/notebook) which uses RecBole. Surprisingly, I got MAP@12 = 0.148 on the eval set by running `trainer.evaluate(test_data)`, but the LB score was only 0.0124. Do you have any idea where this huge gap comes from?

As you may know, [RecBole](https://recbole.io/) is an open-source recommendation library. It's a kind of wrapper around PyTorch, and you can easily build about 80 models with it.

Let me share my strategy. It's difficult to make recommendations for users who rarely shop, so I extracted users who have bought more than 2 times and used them to train my model. For the other users, I recommend popular products. In general, popular products are more likely to be bought, and unpopular ones are not, so I extracted products which have been bought more than 50 times and used them to train my model. The other products I don't recommend at all. (Sorry, sewing geniuses.)

The MAP@12 is 0.148. My model learned 450,255 users' interactions, and there are 1,371,980 users in the submission file. If I'm correct, the LB score should be about 0.0485 (= 0.148 * 450255 / 1371980). I must have made a mistake somewhere.

I'm wondering whether there are some problems in my "making recommendations" section. I found [this awesome notebook](https://www.kaggle.com/code/astrung?scriptVersionId=91596049&cellId=35) which may improve my score, but there would still be a huge gap…

Another guess is that, due to my strategy, the evaluation process was done in "super easy mode", while the submission process is "extremely hard mode". Thanks for reading and taking the time!
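The filtering strategy described above can be sketched in plain Python. The thresholds (more than 2 purchases per user, more than 50 per product) are from the post; the tiny `transactions` list and the lowered item threshold are stand-ins so the toy example actually filters something:

```python
from collections import Counter

# Toy stand-in for the H&M transactions table: (customer_id, article_id) pairs.
transactions = [
    ("u1", "i1"), ("u1", "i2"), ("u1", "i1"),
    ("u2", "i1"),
    ("u3", "i2"), ("u3", "i1"), ("u3", "i3"), ("u3", "i1"),
]

user_counts = Counter(u for u, _ in transactions)
item_counts = Counter(i for _, i in transactions)

# Keep only active users (> 2 purchases) and popular items
# (> 50 purchases in the real data; > 2 here for the toy example).
active_users = {u for u, c in user_counts.items() if c > 2}
popular_items = {i for i, c in item_counts.items() if c > 2}

# Interactions used to train the model; everything else falls back
# to a popularity recommendation (or no recommendation for rare items).
train = [(u, i) for u, i in transactions
         if u in active_users and i in popular_items]
```

This is exactly what makes offline evaluation "super easy mode": the eval set only contains active users and popular items, while the submission covers all 1,371,980 customers.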