YifanHao opened 10 months ago
Sorry for the late reply.
The baseline training file contains three sections that begin with `if __name__=='__main__':`, and the first two are commented out. The first commented-out section (here) holds the hyperparameter-tuning code and provides default values for parameters such as batch size.
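For readers unfamiliar with that layout, here is a minimal sketch. Only the three-section structure with the first two sections commented out comes from the reply above; the section contents, function names, and values are illustrative assumptions:

```python
def run_training(batch_size: int, lr: float) -> None:
    """Stand-in for the real training loop."""
    print(f"training with batch_size={batch_size}, lr={lr}")

# --- Section 1 (commented out): hyperparameter tuning with default values ---
# if __name__ == '__main__':
#     for batch_size in [1024, 10240]:   # candidate values are illustrative
#         for lr in [1e-3, 1e-4]:
#             run_training(batch_size=batch_size, lr=lr)

# --- Section 2 (commented out): a second, unused entry point (hypothetical) ---
# if __name__ == '__main__':
#     run_training(batch_size=1024, lr=1e-3)

# --- Section 3 (active): the entry point that actually runs ---
if __name__ == '__main__':
    run_training(batch_size=10240, lr=1e-3)
```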
Hi! I've read the paper and found it very interesting, so I'm trying to reproduce it. However, I'm stuck at the very first step: training the base collaborative model. I used `baseline_train_sasrec_amazon.py` and `baseline_train_mf_ood_amazon.py` to train the SASRec and MF models separately, with the hyperparameters in the scripts unchanged (except `batch_size`: 10240 -> 1024 in `baseline_train_sasrec_amazon.py`; I thought it might be a typo, since the value is 1024 in `baseline_train_sasrec.py`, `baseline_train_mf_ood.py`, and `baseline_train_mf_ood_amazon.py`). But my results are much lower than those reported in the paper. So I wonder whether the hyperparameters in the scripts are just experimental values rather than the optimal ones, or whether something I overlooked could cause the gap. The training logs produced by my run are as follows:
Thanks for your advice!
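To summarize the per-script defaults being compared, here is a hedged reconstruction as a Python dict; the values come from this thread, but the real scripts may hard-code them differently:

```python
# Per-script default batch sizes, reconstructed from the question above.
DEFAULT_BATCH_SIZE = {
    'baseline_train_sasrec_amazon.py': 10240,  # the outlier questioned here
    'baseline_train_sasrec.py':        1024,
    'baseline_train_mf_ood.py':        1024,
    'baseline_train_mf_ood_amazon.py': 1024,
}
```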
Update: It turns out that setting `batch_size` to 10240 does make sense for SASRec on the Amazon dataset; after doing so, my result is close to the value reported in the paper. For the MF model, `weight_decay` seems to be the key parameter: performance improves noticeably after I set it to 1e-5.
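For anyone reproducing this, a short sketch of how those two findings would typically be applied. The optimizer choice (Adam), the learning rate, and the placeholder model are assumptions, not from this thread:

```python
import torch

model = torch.nn.Linear(64, 64)  # placeholder for the real SASRec / MF model

# SASRec on Amazon: keep the script's original batch_size of 10240.
batch_size = 10240

# MF: weight_decay=1e-5 was the key setting; lr here is an assumed value.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
```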