Open jeffreymei opened 3 years ago
https://github.com/pmixer/SASRec.pytorch/blob/72ba89d4a1d0319389ef67ee416e33b7431c8b9b/model.py#L85
Can you explain what this line does? Why is the attention output being added to Q?
It's a skip (residual) connection, like the one used in ResNet. Here's how it was implemented in TensorFlow by the paper's author:
https://github.com/kang205/SASRec/blob/e3738967fddab206d6eeb4fda433e7a7034dd8b1/modules.py#L219
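A minimal PyTorch sketch of the idea (assumed shapes and layer sizes, not the repo's exact code): the attention block's input `Q` is added back to its output, so the block learns a refinement of `Q` rather than a full replacement, which is exactly the ResNet-style residual pattern.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only
mha = nn.MultiheadAttention(embed_dim=64, num_heads=2, batch_first=True)

Q = torch.randn(8, 10, 64)   # (batch, seq_len, embed_dim)
attn_out, _ = mha(Q, Q, Q)   # self-attention over the sequence

# The skip connection in question: add the input back to the output
out = Q + attn_out

print(out.shape)             # same shape as Q
```

Without the residual term, gradients must flow entirely through the attention weights; adding `Q` back gives a direct gradient path and lets the layer default to (near) identity, which stabilizes training in deep stacks.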