Closed · youngzw closed this issue 2 years ago
By the way, I wonder why the performance of the SASRec model is higher than reported in the original paper. Here are the results on the ml-1m dataset:
- the original code (re-run by me), sampling 99 negative items: *(screenshot)*
- the original code (re-run by me), sampling 100 negative items: *(screenshot)*
- results reported in the original paper, sampling 100 negative items: *(screenshot)*
- ReChorus (default: sampling 99 negative items): *(screenshot)*

When ranking over all items with `--test_all 1`:
- the original code (modified by me): *(screenshot)*
- ReChorus: *(screenshot)*
I checked the original code of SASRec and found that it adopts a quite different training paradigm. In our framework, a sequence of length 200 is fed into the forward function 199 times; each time, the input consists of one target item and its corresponding history sequence. This is easier to understand and more flexible for designing complex models (similar implementations can be found in RecBole). In the original SASRec code, however, the sequence is encoded only once: the model produces 200 logits, one per position, and uses 199 of them to calculate the loss. This is far more efficient, but it requires the model to be able to output all the logits simultaneously. The difference in training paradigm might also explain the inconsistent performance.
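The efficiency gap can be sketched with a rough token count. This is a back-of-the-envelope comparison, not code from either repository (the function names are mine, and it ignores constant factors such as the quadratic cost of attention within a single pass):

```python
def per_target_cost(seq_len):
    # ReChorus-style paradigm: one forward pass per target item;
    # the pass for target t re-encodes its history prefix of length t.
    # Total tokens encoded = 1 + 2 + ... + (seq_len - 1).
    return sum(t for t in range(1, seq_len))

def encode_once_cost(seq_len):
    # Original SASRec paradigm: the whole sequence is encoded once,
    # yielding a logit at every position in a single pass.
    return seq_len

print(per_target_cost(200))   # 19900 tokens encoded across 199 passes
print(encode_once_cost(200))  # 200 tokens encoded in one pass
```

So for length-200 sequences, the per-target paradigm encodes roughly 100x more tokens per epoch, which is consistent with the slowdown described below.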
Hi THUwangcy.
I used your library to run the SASRec model with the command below (CUDA environment):
The parameters are the same as in the original SASRec paper. However, I find that training is much slower per epoch than the original code, even after modifying the code to avoid running inference at every epoch. https://github.com/THUwangcy/ReChorus/blob/9a4a783de1fdd02292fb95bc3471b8d310d2110a/src/helpers/BaseRunner.py#L121-L126 Can you suggest a solution to this problem?