some question about data(Each line corresponds to the line of test.rating, containing 99 negative samples. But why 99?)

hexiangnan / neural_collaborative_filtering

Neural Collaborative Filtering

Apache License 2.0


some question about data(Each line corresponds to the line of test.rating, containing 99 negative samples. But why 99?) #62

Open yifannir opened 4 years ago

yifannir commented 4 years ago

According to the description, "Each line corresponds to the line of test.rating, containing 99 negative samples." Why is the number of negative samples 99? When that number is small, the measured performance will of course look good. Shouldn't the test item be ranked against all of the negative items instead? Thank you.
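
To put a rough number on that concern, here is a minimal sketch, assuming a scorer that ranks uniformly at random. `random_hit_ratio` is a hypothetical helper for illustration, not part of the repository. Under that assumption the test item is equally likely to land at any of the n + 1 positions, so the chance of a top-k hit is k / (n + 1).

```python
def random_hit_ratio(num_negatives: int, k: int = 10) -> float:
    """Expected HR@k of a uniformly random ranker (illustrative assumption).

    The test item is ranked among num_negatives + 1 candidates and is
    equally likely to land at any position, so it falls in the top k
    with probability k / (num_negatives + 1), capped at 1.
    """
    if num_negatives < 0 or k < 1:
        raise ValueError("num_negatives must be >= 0 and k must be >= 1")
    return min(1.0, k / (num_negatives + 1))


if __name__ == "__main__":
    # 99 is the count described for test.negative; 999 and 9999 are
    # arbitrary larger pools chosen only for comparison.
    for n in (99, 999, 9999):
        print(n, random_hit_ratio(n))
```

So even a model with no signal gets a noticeably higher HR@10 on a small candidate pool than on a large one, which is why scores from sampled evaluation are not directly comparable to full-ranking scores.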

KylinA1 commented 4 years ago

It should be compared among all the items. But as the authors state in the paper:

"since it is too time consuming to rank all items for every user during evaluation, we ... randomly samples 100 items that are not interacted by the user"

That said, the ml-1m and Pinterest datasets are actually pretty small. PS: almost all related papers say they sample 100 negative items, but in fact it is 99.
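
For concreteness, here is a minimal sketch of this kind of sampled evaluation. It is not the repository's evaluation code: `sample_negatives`, `evaluate_one`, and the `score` callable are hypothetical names, and ties are resolved in favour of the test item as a simplifying assumption. With 99 sampled negatives, the test item is ranked among 100 candidates in total.

```python
import math
import random
from typing import Callable, Optional, Sequence, Set, Tuple


def sample_negatives(
    num_items: int,
    interacted: Set[int],
    num_negatives: int = 99,
    rng: Optional[random.Random] = None,
) -> list:
    """Sample distinct items the user has not interacted with."""
    rng = rng or random.Random(0)
    pool = [i for i in range(num_items) if i not in interacted]
    # Raises ValueError if the pool holds fewer than num_negatives items.
    return rng.sample(pool, num_negatives)


def evaluate_one(
    score: Callable[[int, int], float],
    user: int,
    test_item: int,
    negatives: Sequence[int],
    k: int = 10,
) -> Tuple[float, float]:
    """Return (HR@k, NDCG@k) for one user's held-out test item."""
    test_score = score(user, test_item)
    # Zero-based rank: how many negatives score strictly higher
    # (assumption: ties count in favour of the test item).
    rank = sum(1 for item in negatives if score(user, item) > test_score)
    if rank < k:
        return 1.0, 1.0 / math.log2(rank + 2)
    return 0.0, 0.0


if __name__ == "__main__":
    # Toy scorer (assumption): prefers items with larger IDs.
    toy_score = lambda user, item: float(item)
    negs = sample_negatives(num_items=1000, interacted={990}, num_negatives=99)
    print(len(negs) + 1, "candidates")
    print(evaluate_one(toy_score, user=0, test_item=990, negatives=negs))
```

Averaging the two returned values over all users gives the HR@k and NDCG@k of the run; changing `num_negatives` to 100 gives the 101-candidate variant.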

Chuan1997 commented 4 years ago

So, in order to fully reproduce the results of the paper, do we need to set num_neg to 99?

beathahahaha commented 4 years ago

I don't think the data description is right. The number of negative items should be 100 instead of 99. (Original article: "we followed the common strategy that randomly samples 100 items that are not interacted by the user, ranking the test item among the 100 items.") I also suggest reading this paper, which states the setup clearly: Self-Attentive Sequential Recommendation, ICDM 2018.