enoche / MMRec

A Toolbox for MultiModal Recommendation. Integrating 10+ Models...
GNU General Public License v3.0

I have some confusion about regarding user interacted items as positive samples #9

Closed XuHao-bit closed 1 year ago

XuHao-bit commented 1 year ago

In the open-source datasets, such as the Baby dataset, I have noticed interaction records with very low ratings in baby.inter (e.g., the record 6 5374 1.0 1374019200 2 has a rating of 1). However, in the metric calculation part of the code, items a user has interacted with are treated as positive samples, and items the user has not interacted with are treated as negative samples. Under this setup, some low-rated items are also counted as positive samples. Does this situation of treating low-rated items as positive samples introduce bias? Does it affect the model training process?

enoche commented 1 year ago

> Does this situation of treating low-rated items as positive samples introduce bias? Does it affect the model training process?

Both answers are yes.

Hi @XuHao-bit, thanks for your feedback. In this repo, we follow previous research settings and treat all interactions as implicit feedback: every rating is discretized to 0/1, so any recorded interaction counts as a positive sample regardless of its rating value. Some other research papers keep only ratings >= 3 or == 5 as positive samples. You may train our models under these different settings and analyze the results. We look forward to your experimental results. Thanks.
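For anyone who wants to try the alternative settings mentioned above, here is a minimal sketch of filtering an interaction file by rating before training. It is not code from this repo. The tab-separated layout and the column names (userID, itemID, rating, timestamp, x_label) are assumptions inferred from the sample row quoted in the question, the inline data is made up for illustration, and `load_interactions` and `keep_positive` are hypothetical helper names.

```python
import csv
import io

# Made-up sample data. The column names and tab separator are ASSUMPTIONS
# inferred from the row "6 5374 1.0 1374019200 2" quoted in the question.
SAMPLE_INTER = (
    "userID\titemID\trating\ttimestamp\tx_label\n"
    "6\t5374\t1.0\t1374019200\t2\n"
    "6\t120\t5.0\t1374019300\t0\n"
    "7\t5374\t3.0\t1374019400\t0\n"
    "7\t88\t2.0\t1374019500\t1\n"
)


def load_interactions(text):
    """Parse tab-separated interaction records into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def keep_positive(rows, rule="all"):
    """Select the interactions treated as positive samples.

    rule="all": every interaction is positive (implicit-feedback setting).
    rule="ge3": only ratings >= 3 are positive.
    rule="eq5": only ratings == 5 are positive.
    """
    if rule == "all":
        return list(rows)
    if rule == "ge3":
        return [r for r in rows if float(r["rating"]) >= 3.0]
    if rule == "eq5":
        return [r for r in rows if float(r["rating"]) == 5.0]
    raise ValueError(f"unknown rule: {rule!r}")


rows = load_interactions(SAMPLE_INTER)
# Count how many interactions survive under each setting.
counts = {rule: len(keep_positive(rows, rule)) for rule in ("all", "ge3", "eq5")}
print(counts)
```

The filtered rows would then need to be written back in the same layout before training. Note that filtering changes which users and items remain in the data, so results under different rules are not directly comparable unless the train/validation/test split is rebuilt consistently.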