yueluhhxx opened this issue 4 months ago
Hi, the increase in loss is normal because we train the model with scheduled sampling: the loss rises as the sampling rate increases. You just need to make sure that SR and SPL converge.
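For intuition, here is a minimal toy sketch of scheduled sampling (the class and schedule below are illustrative only, not this repo's code): as the sampling rate grows, more steps are trained on the model's own predictions instead of ground-truth actions, so the raw loss value naturally drifts upward even while SR/SPL keep improving.

```python
# Toy scheduled-sampling rollout. All names here (ToyPolicy, the rate schedule)
# are hypothetical placeholders, not taken from this repo.
import torch
import torch.nn as nn


class ToyPolicy(nn.Module):
    def __init__(self, n_actions=4, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(n_actions, hidden)
        self.rnn = nn.GRUCell(hidden, hidden)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, prev_action, h):
        h = self.rnn(self.embed(prev_action), h)
        return self.head(h), h


def rollout_loss(policy, gt_actions, sample_rate):
    """gt_actions: (B, T) ground-truth action ids."""
    B, T = gt_actions.shape
    h = torch.zeros(B, policy.rnn.hidden_size)
    prev = gt_actions[:, 0]
    ce = nn.CrossEntropyLoss()
    loss = 0.0
    for t in range(1, T):
        logits, h = policy(prev, h)
        loss = loss + ce(logits, gt_actions[:, t])
        # Scheduled sampling: with probability `sample_rate`, feed the model's
        # own prediction to the next step instead of the ground-truth action.
        use_sample = torch.rand(B) < sample_rate
        pred = logits.argmax(dim=-1)
        prev = torch.where(use_sample, pred, gt_actions[:, t])
    return loss / (T - 1)


if __name__ == "__main__":
    policy = ToyPolicy()
    gt = torch.randint(0, 4, (8, 10))
    for step in range(3):
        # The sampling rate typically ramps up over training, so later steps
        # see harder (self-predicted) inputs and the raw loss can rise.
        rate = min(1.0, step * 0.5)
        print(rate, rollout_loss(policy, gt, rate).item())
```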
Hi! Thank you for your answer! After fine-tuning the model on R2R-CE, I evaluated it and found that the success rate was only about 46%. The pre-training phase uses only the MLM and SAP tasks, with 60,000 pre-training iterations and 10,000 fine-tuning iterations. What can I try to improve the success rate? Do I need to increase the number of training iterations?
Hi, sorry, I have no idea about this. But I think you could first evaluate the released checkpoint, or use the released pre-trained checkpoint to fine-tune the policy, to rule out potential problems caused by your machine or CUDA version.
Thank you for your reply! I guess it's because I only used two 3090 GPUs, so the batch size decreased, resulting in the performance gap.
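(For a quick sanity check of that hypothesis, the effective batch size is just the per-GPU batch size times the number of GPUs. The numbers below are placeholders, and the linear learning-rate scaling rule is only a common heuristic, not something this repo necessarily prescribes.)

```python
# Hypothetical numbers; substitute the values from your own config.
ref_gpus, ref_batch_per_gpu, ref_lr = 4, 16, 1e-5   # assumed reference setup
my_gpus, my_batch_per_gpu = 2, 16                   # e.g. two RTX 3090s

ref_effective = ref_gpus * ref_batch_per_gpu
my_effective = my_gpus * my_batch_per_gpu

# Linear LR-scaling heuristic: keep lr / effective_batch_size roughly constant.
suggested_lr = ref_lr * my_effective / ref_effective
print(ref_effective, my_effective, suggested_lr)
```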
I encountered convergence issues while fine-tuning my pre-trained model following the R2R-CE code.
I did not modify the pre-training or fine-tuning settings. After 60,000 pre-training steps, I used the model_step_60000.pt path as 'MODEL.pretrained_path' for fine-tuning. During fine-tuning, however, the loss decreased from 1.57 to 1.08, then began to rise, and after that it fluctuated and could not converge.
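For context, loading the pre-trained weights into the fine-tuning policy is, as I understand it, roughly the pattern sketched below (the class name and checkpoint layout are placeholders, not this repo's actual code); printing the missing/unexpected keys is an easy way to confirm the checkpoint really matches the model.

```python
import torch
import torch.nn as nn

# `PolicyNet` is a stand-in for the real fine-tuning policy class.
class PolicyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 8)

model = PolicyNet()
ckpt_path = "ckpts/model_step_60000.pt"   # whatever MODEL.pretrained_path points to
ckpt = torch.load(ckpt_path, map_location="cpu")

# Some repos save the raw state_dict, others wrap it in a dict; handle both.
state_dict = ckpt["state_dict"] if isinstance(ckpt, dict) and "state_dict" in ckpt else ckpt

# strict=False tolerates heads that exist only in pre-training (e.g. the MLM head)
# and reports what was skipped.
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)
```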
I have no idea how to solve this convergence problem. Was there a problem with any of my steps? Also, during the pre-training phase I noticed that two types of checkpoint files are generated: model_step.pt and train_state.pt. I did not use train_state.pt, and I would like to know what this file is for.
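My guess, sketched below, is that model_step_*.pt holds only the model weights while train_state.pt holds the optimizer/step state needed to resume an interrupted pre-training run. This is only an assumption based on the common PyTorch convention (you can inspect the file with torch.load to check), so please correct me if it is wrong.

```python
# Assumed split between the two checkpoint files; placeholder model/optimizer.
import torch
import torch.nn as nn

model = nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
step = 60000

# Weights-only checkpoint: what gets passed to fine-tuning via MODEL.pretrained_path.
torch.save(model.state_dict(), f"model_step_{step}.pt")

# Training-state checkpoint: used to resume pre-training, not needed for fine-tuning.
torch.save(
    {"step": step,
     "optimizer": optimizer.state_dict(),
     "rng_state": torch.get_rng_state()},
    "train_state.pt",
)

# Resuming would then look roughly like:
state = torch.load("train_state.pt", map_location="cpu")
optimizer.load_state_dict(state["optimizer"])
torch.set_rng_state(state["rng_state"])
start_step = state["step"]
```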
Your work is always great! I'm looking forward to hearing back from you! Thanks a lot!