Open lipingcoding opened 3 years ago
btw @lipingcoding could you pls provide a bit more information, like logs etc., to describe the observation that the model does not converge? For example, uncommenting https://github.com/pmixer/SASRec.pytorch/blob/30c43cf090d429480339ab18d43354b3e399bc29/main.py#L93 would print the loss for each iteration, which should be informative.
@lipingcoding try the newly updated code?
@lipingcoding thx for reporting the issue. I also observed some problems and am still wondering what's wrong with the PyTorch version compared with the TF implementation. Sorry to say, I still haven't figured it out. The only thing I can be sure of currently is that the original paper's hyperparameter settings can be used directly with this codebase, since I fixed a leaky attention issue by using PyTorch's MHA. The parameter initialization issue still needs to be looked into, but I haven't done that yet.
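For anyone reading along, "leaky attention" here presumably means a position attending to later items in the sequence. Below is a minimal pure-Python sketch of causal masking to illustrate the idea. It is not the code from this repo, and the helper names `causal_mask` and `masked_attention_weights` are made up for illustration. With `torch.nn.MultiheadAttention`, the equivalent is passing a boolean `attn_mask` where `True` marks positions that must not be attended to.

```python
import math

def causal_mask(seq_len):
    # mask[i][j] is True when position i must NOT attend to position j (j > i)
    return [[j > i for j in range(seq_len)] for i in range(seq_len)]

def masked_attention_weights(scores, mask):
    # Row-wise softmax over the allowed positions only; masked entries get weight 0
    weights = []
    for row, mask_row in zip(scores, mask):
        allowed = [s for s, m in zip(row, mask_row) if not m]
        top = max(allowed)  # subtract the max for numerical stability
        exps = [0.0 if m else math.exp(s - top) for s, m in zip(row, mask_row)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy attention scores for a sequence of 3 items (hypothetical values)
scores = [[0.1, 2.0, 3.0],
          [0.5, 0.2, 4.0],
          [1.0, 1.0, 1.0]]
weights = masked_attention_weights(scores, causal_mask(3))
for row in weights:
    print([round(w, 3) for w in row])
```

Every entry above the diagonal comes out as zero, so the large scores on future positions (3.0 and 4.0 above) cannot leak into earlier positions.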