Open yifannir opened 4 years ago
Ideally the test item should be ranked against all items. But as the authors explain in the paper,
"since it is too time consuming to rank all items for every user during evaluation, we ... randomly samples 100 items that are not interacted by the user",
even though the ml-1m and Pinterest datasets are actually pretty small. PS: almost all related papers say they sample 100 negative items, but in fact it is 99.
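To make the 99 vs. 100 point concrete, here is a minimal sketch of this kind of sampled evaluation. It is not the repository's code: the function names, the toy item ids, and the random scores are all assumptions for illustration. It samples 99 items the user has not interacted with, adds the held-out test item, and ranks the resulting 100 candidates.

```python
import math
import random


def sample_negatives(interacted, num_items, num_neg, rng):
    """Draw num_neg distinct item ids the user has NOT interacted with.

    `interacted` must also contain the held-out test item, so that it
    cannot be drawn as a negative.
    """
    pool = [i for i in range(num_items) if i not in interacted]
    return rng.sample(pool, num_neg)


def hit_ratio_and_ndcg(scores, test_item, k=10):
    """Return (HR@k, NDCG@k) for one user.

    `scores` maps each candidate item (negatives plus test item) to a score.
    """
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    if test_item not in top_k:
        return 0.0, 0.0
    rank = top_k.index(test_item)  # 0-based position in the top-k list
    return 1.0, 1.0 / math.log2(rank + 2)


# Toy setup (hypothetical numbers, not taken from ml-1m or Pinterest).
rng = random.Random(0)
num_items = 500
test_item = 42
interacted = {1, 2, 3, test_item}

negatives = sample_negatives(interacted, num_items, num_neg=99, rng=rng)
candidates = negatives + [test_item]  # 99 negatives + 1 test item = 100

# Stand-in for model predictions: random scores per candidate.
scores = {item: rng.random() for item in candidates}
hr, ndcg = hit_ratio_and_ndcg(scores, test_item, k=10)
```

Under this reading, "ranking the test item among the 100 items" and "99 negative samples" describe the same candidate list.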
So in order to fully reproduce the results of the paper, do we need to set num_neg to 99?
I don't think the data profile is right. The number of negative items should be 100 instead of 99. (Original article: "we followed the common strategy that randomly samples 100 items that are not interacted by the user, ranking the test item among the 100 items.") I also suggest reading the paper "Self-Attentive Sequential Recommendation" (ICDM 2018), which states this clearly.
According to the description "Each line corresponds to the line of test.rating, containing 99 negative samples.", why is the number of negative samples 99? When that number is small, the performance will of course look better. Should all the negative items be used instead? Thank you.
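The effect of the candidate-list size can be seen with a random scorer: if the test item is equally likely to land at any position among num_neg + 1 candidates, its chance of falling in the top k is k / (num_neg + 1). The helper below is a hypothetical illustration of that arithmetic, not something from the repository.

```python
def random_hit_ratio(num_neg, k=10):
    """Expected HR@k of a uniformly random ranker.

    The test item is ranked among num_neg negatives plus itself,
    so each of the num_neg + 1 positions is equally likely.
    """
    return min(1.0, k / (num_neg + 1))


# Fewer sampled negatives means a higher baseline before any learning.
baseline_99 = random_hit_ratio(99)      # 99 sampled negatives
baseline_999 = random_hit_ratio(999)    # 999 sampled negatives
baseline_3705 = random_hit_ratio(3705)  # hypothetical full-catalog size
```

So numbers obtained with 99 sampled negatives are only comparable to other numbers obtained with the same protocol, not to full-ranking results.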