SAI990323 / TALLRec


A few questions in your paper and code #14

Closed: xyz189411yt closed this issue 1 year ago

xyz189411yt commented 1 year ago
  1. You mention in the paper that the learning rate is 1e-3, but in instruct_7B.sh it is 1e-4. Which value did you actually use?
  2. The optimizer in the paper is Adam, but the training file uses AdamW.
  3. The loss function in the paper is MSE, but as far as I know most language models use cross-entropy loss. I could not find where the loss function is defined in your code; the training seems to rely on the cross-entropy loss predefined inside LLaMA (see the sketch at the end of this comment).

I am not sure whether I missed the right file or whether the code or the paper needs to be updated.
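
For reference, here is a minimal sketch (not code from this repository) of the loss I believe the training script falls back on: Hugging Face's LlamaForCausalLM computes a shifted next-token cross-entropy internally whenever `labels` are passed, so no explicit loss definition is needed in the fine-tuning code.

```python
import torch
from torch.nn import CrossEntropyLoss

def causal_lm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Mirrors the next-token cross-entropy computed inside LlamaForCausalLM.forward."""
    # Shift so that tokens < n predict token n.
    shift_logits = logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    # Positions labeled -100 (e.g. prompt or padding tokens) are ignored by the loss.
    loss_fct = CrossEntropyLoss(ignore_index=-100)
    return loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
```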

SAI990323 commented 1 year ago

Thanks for the reminder. There is a typo in the arXiv preprint version: that setting was applied to all baseline methods (traditional recommendation models). For the LLM-based experiments, the hyperparameters are the ones available on GitHub. We will fix these typos and add the LLaMA hyperparameters to the paper in the camera-ready version.
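
For anyone reconciling the paper with the repository, here is a minimal, illustrative sketch (not the actual TALLRec training code) of the optimizer setting implied by instruct_7B.sh and the training file, i.e. AdamW rather than Adam with lr = 1e-4; in practice the script presumably lets the Hugging Face Trainer construct this optimizer from its learning-rate argument.

```python
import torch

def build_optimizer(model: torch.nn.Module, lr: float = 1e-4) -> torch.optim.Optimizer:
    # AdamW differs from plain Adam by decoupling weight decay from the adaptive gradient update.
    return torch.optim.AdamW(model.parameters(), lr=lr)
```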