Use of `pos` and `neg` vectors in training

pmixer / SASRec.pytorch

PyTorch(1.6+) implementation of https://github.com/kang205/SASRec

Apache License 2.0


Use of `pos` and `neg` vectors in training #12

Closed: jeffreymei closed this issue 3 years ago

jeffreymei commented 3 years ago

https://github.com/pmixer/SASRec.pytorch/blob/master/utils.py#L32 Could you please explain the reason why the pos vector is the same as the input sequence but offset by one timestep? Is it attempting to learn to predict each of the items in pos and not just the final one?

pmixer commented 3 years ago

Hi Jeffrey @jeffreymei ,

Yes, exactly. The pos vector is the input sequence shifted by one timestep, and it serves as the ground truth for next-item prediction.

Because the transformer decoder can only see data at t <= current_timestamp, we can predict the next item for every timestep in a single pass, which makes training more efficient.
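To make the one-step offset concrete, here is a minimal sketch of how a single training example could be built. It is written in the spirit of the sampler linked above but is not a copy of it: `build_training_example`, `random_neq` as defined here, their parameters, and the toy history are assumptions made for illustration.

```python
import random


def random_neq(low, high, exclude, rng):
    """Draw an item id in [low, high) that is not in `exclude`."""
    item = rng.randrange(low, high)
    while item in exclude:
        item = rng.randrange(low, high)
    return item


def build_training_example(history, maxlen, itemnum, rng):
    """Return left-padded (seq, pos, neg) lists of length `maxlen`.

    seq[t] is the item seen at step t, pos[t] is the item that follows it,
    and neg[t] is one random item the user never interacted with.
    Id 0 is assumed to be the padding id.
    """
    seq = [0] * maxlen
    pos = [0] * maxlen
    neg = [0] * maxlen
    seen = set(history)
    nxt = history[-1]          # target for the last input position
    idx = maxlen - 1
    for item in reversed(history[:-1]):
        seq[idx] = item
        pos[idx] = nxt         # ground truth = the next item in the history
        neg[idx] = random_neq(1, itemnum + 1, seen, rng)
        nxt = item
        idx -= 1
        if idx < 0:            # history longer than maxlen: keep the tail
            break
    return seq, pos, neg


seq, pos, neg = build_training_example(
    [3, 7, 2, 9], maxlen=5, itemnum=20, rng=random.Random(0)
)
print(seq)  # [0, 0, 3, 7, 2]
print(pos)  # [0, 0, 7, 2, 9]
```

Every non-padded position carries its own target, so one forward pass yields one prediction per timestep.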

Meanwhile, SASRec only uses the last state of the model output (model(input)[:, -1, :]) for evaluation and testing. You may try a different loss function, or even a new training/testing paradigm, to improve it further.
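As a rough illustration of training on every timestep while skipping padding, the sketch below computes a binary cross-entropy loss over the non-padded positions only. The binary cross-entropy objective is an assumption here (it is one possible choice of loss), and `masked_bce_loss` and the toy logits are hypothetical names and values, not taken from the repository.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def masked_bce_loss(pos_logits, neg_logits, pos):
    """Binary cross-entropy averaged over positions where pos != 0.

    Positive logits are pushed towards label 1 and negative logits
    towards label 0; padded positions (pos == 0) contribute nothing.
    """
    valid = [t for t, item in enumerate(pos) if item != 0]
    if not valid:
        return 0.0
    pos_loss = sum(softplus(-pos_logits[t]) for t in valid) / len(valid)
    neg_loss = sum(softplus(neg_logits[t]) for t in valid) / len(valid)
    return pos_loss + neg_loss


pos = [0, 0, 7, 2, 9]                    # first two positions are padding
pos_logits = [5.0, 5.0, 0.0, 0.0, 0.0]   # toy scores for the positive items
neg_logits = [5.0, 5.0, 0.0, 0.0, 0.0]   # toy scores for the negative items
loss = masked_bce_loss(pos_logits, neg_logits, pos)
print(round(loss, 4))  # 1.3863
```

At evaluation time only the final position matters, which in this list-based sketch would simply be the last element of the per-position outputs.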

Regards, Zan

johnny12150 commented 3 years ago

Quoting the paper:

For each user u, we randomly sample 100 negative items, and rank these items with the ground-truth item. Based on the rankings of these 101 items, Hit@10 and NDCG@10 can be evaluated.

How could I set the number of negative items in the code? It seems the negative sampling is only based on https://github.com/pmixer/SASRec.pytorch/blob/d192a40203ea1391410f71445b863fc3cc3e2620/utils.py#L33

The number of negatives should not be fixed.

pmixer commented 3 years ago

@johnny12150 Yes, for flexibility the number of negative samples should not be hard-coded. That said, the paper fixes the number of negative items at 100, so it's okay for code meant for experiments to have it fixed too. If you prefer a different number of negative items, please search for 100 in the current code (e.g. https://github.com/pmixer/SASRec.pytorch/blob/d192a40203ea1391410f71445b863fc3cc3e2620/utils.py#L135) and change it, or turn it into a new variable. BTW, I even prefer not to use negative sampling for more fair model evaluation, pls check https://github.com/pmixer/TiSASRec.debug if interested in evaluating the model without negative sampling.
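For illustration, turning the hard-coded 100 into a variable might look like the sketch below, which also shows how Hit@10 and NDCG@10 follow from the rank of the ground-truth item among the sampled candidates. The function names, the `num_negatives` parameter, and the toy scores are assumptions for this example; this sketch also avoids duplicate negatives, which may differ from what the repository does.

```python
import math
import random


def sample_eval_candidates(target, rated, itemnum, num_negatives, rng):
    """Return [target] followed by `num_negatives` distinct unseen items.

    Item ids are assumed to lie in 1..itemnum.
    """
    excluded = set(rated) | {target}
    if itemnum - len(excluded) < num_negatives:
        raise ValueError("not enough unseen items to sample from")
    candidates = [target]
    while len(candidates) < num_negatives + 1:
        item = rng.randint(1, itemnum)   # inclusive on both ends
        if item not in excluded:
            candidates.append(item)
            excluded.add(item)           # no duplicate negatives
    return candidates


def rank_of_target(scores):
    """0-based rank of scores[0] (the target) among all candidate scores."""
    return sum(1 for s in scores[1:] if s > scores[0])


def hit_and_ndcg(rank, k=10):
    """Hit@k and NDCG@k for a single ground-truth item at 0-based `rank`."""
    if rank < k:
        return 1.0, 1.0 / math.log2(rank + 2)
    return 0.0, 0.0


rng = random.Random(0)
candidates = sample_eval_candidates(
    target=42, rated={1, 2, 3, 42}, itemnum=1000, num_negatives=100, rng=rng
)
# Toy scores: the target gets the highest score, so it ranks first.
scores = [1.0 if item == 42 else 0.0 for item in candidates]
print(len(candidates), hit_and_ndcg(rank_of_target(scores)))  # 101 (1.0, 1.0)
```

In a real run the scores would come from the model's prediction for the last position of the user's sequence rather than from a toy list.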

johnny12150 commented 3 years ago

@pmixer Thanks!

"BTW, I even prefer not to use negative sampling for more fair model evaluation, pls check https://github.com/pmixer/TiSASRec.debug if interested in evaluating the model without negative sampling."

This is exactly what I want.

Edit: Does the PyTorch version of TiSASRec also support evaluation without negative sampling? We have no control to limit the number of negative items to be used in training, right?
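For context, evaluating without negative sampling generally means ranking the ground-truth item against every item the user has not interacted with. The sketch below shows that idea only; it makes no claim about how TiSASRec.debug implements it, and `full_rank_of_target` and the toy scores are hypothetical.

```python
def full_rank_of_target(scores, target, rated):
    """0-based rank of `target` among all items the user has not rated.

    `scores` is indexed by item id, with index 0 reserved for padding.
    """
    target_score = scores[target]
    return sum(
        1
        for item in range(1, len(scores))
        if item != target and item not in rated and scores[item] > target_score
    )


scores = [0.0, 0.9, 0.1, 0.8, 0.5, 0.7]   # toy scores for item ids 0..5
rank = full_rank_of_target(scores, target=4, rated={1, 2})
print(rank)  # 2
```

The resulting rank can feed the same Hit@10 and NDCG@10 formulas as in the sampled setting, but the numbers are not directly comparable with results obtained from 100 sampled negatives.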

pmixer commented 3 years ago

@johnny12150 See the pytorch branch: https://github.com/pmixer/TiSASRec.debug/tree/pytorch

Concerning 'We have no control to limit the number of negative items to be used in training, right': we can modify just a few lines of code to switch from a fixed number of negative samples to a user-defined, dynamic number, but we need to do it ourselves if we need this feature.
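One way such a change could look is sketched below: instead of a single negative per position, the sampler draws `num_neg` negatives for every non-padded position. This is only an assumption about a possible modification; `sample_negatives` and `num_neg` are hypothetical names, and the loss would also have to be adapted to handle the extra dimension.

```python
import random


def sample_negatives(pos, seen, itemnum, num_neg, rng):
    """Return one list of `num_neg` negatives per position in `pos`.

    Padded positions (pos == 0) get all-zero rows. Negatives are drawn
    uniformly from item ids 1..itemnum that the user has not seen;
    repeats within a row are allowed in this sketch.
    """
    neg = []
    for target in pos:
        if target == 0:
            neg.append([0] * num_neg)
            continue
        row = []
        while len(row) < num_neg:
            item = rng.randint(1, itemnum)   # inclusive on both ends
            if item not in seen:
                row.append(item)
        neg.append(row)
    return neg


pos = [0, 0, 7, 2, 9]
neg = sample_negatives(
    pos, seen={3, 7, 2, 9}, itemnum=20, num_neg=4, rng=random.Random(0)
)
print(len(neg), len(neg[0]))  # 5 4
```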

johnny12150 commented 3 years ago

The training part doesn't use this code: https://github.com/pmixer/SASRec.pytorch/blob/d192a40203ea1391410f71445b863fc3cc3e2620/utils.py#L135-L138

So I think we need to modify https://github.com/pmixer/SASRec.pytorch/blob/d192a40203ea1391410f71445b863fc3cc3e2620/utils.py#L33 instead, to make sure it generates all possible pairs, right?