About negative item sampling for a given user in def _get_train_batch(i)

daiquanyu commented 6 years ago

# negative pairs
for dns in range(_model.dns):
    user = _user_input[_index[idx]]
    user_neg_batch.append(user)
    # negative k
    gtItem = _dataset.testRatings[user][1]  # why can test-pair info be used here?
    j = np.random.randint(_dataset.num_items)  # randomly sample an item
    # j must not equal gtItem: why exclude the test-set item when sampling negatives?
    while j in _dataset.trainList[_user_input[_index[idx]]] or j == gtItem:
        j = np.random.randint(_dataset.num_items)
    item_neg_batch.append(j)

My question is: why can the observed user-item information in the test set be used for negative sampling during training? Thanks.

hexiangnan commented 6 years ago

In training, you are supposed to know nothing about the testing set...

daiquanyu commented 6 years ago

In this sampling function, the negative item sampled for a given user is constrained not to be that user's positive item in the test set. As I understand it, this means information from the test set is being used during training. I think this constraint should not exist.

I don't know whether this is a common setting in recommendation; I'm new to this area. Thanks for your kind reply.

hexiangnan commented 6 years ago

That seems to be a mistake. Let me check with the coder. Thanks for pointing it out.

Coder-Yu commented 5 years ago

I am also confused by this part. I suppose this is a mistake, as the test set should be invisible during training.

hexiangnan commented 5 years ago

@wonniu @Coder-Yu Thank you for pointing this issue out. This is a bug in the previous implementation. We have fixed it and confirmed that this bug has negligible impact on the final results.
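The fix itself is not shown in this thread. As a rough illustration only, a sampler that rejects just the user's training positives and never looks at the test set might look like the sketch below. The function name `sample_negatives`, its parameters, and the use of a plain Python set in place of `_dataset.trainList[user]` are assumptions made for this example, not the repository's code.

```python
import numpy as np


def sample_negatives(train_items, num_items, num_neg, rng):
    """Draw num_neg item ids the user has not interacted with in training.

    train_items is a hypothetical stand-in for _dataset.trainList[user]:
    the set of item ids observed for this user in the training data.
    """
    if len(train_items) >= num_items:
        raise ValueError("user interacted with every item; no negatives exist")
    negatives = []
    for _ in range(num_neg):
        j = int(rng.integers(num_items))
        # Reject only training positives; the held-out test item is never consulted.
        while j in train_items:
            j = int(rng.integers(num_items))
        negatives.append(j)
    return negatives


rng = np.random.default_rng(0)
negs = sample_negatives({0, 2, 5}, num_items=10, num_neg=4, rng=rng)
```

With this version the held-out test item can occasionally be drawn as a negative, which is the expected behaviour when the test set is treated as unseen during training.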

hexiangnan / adversarial_personalized_ranking

About negative item sampling for a given user in def _get_train_batch(i) #1
