Open RanFeng2 opened 7 months ago
Thank you for your work on this project. It has been incredibly helpful to me.

However, I've run into a problem I'd like your advice on. While training the student model, the loss becomes NaN during the first epoch, which is quite perplexing.

Thank you in advance for your time and help.

Hi~ Thanks for your interest in my work. You can troubleshoot from the following aspects:

After adjusting the learning rate, the NaN values no longer appear. However, the loss remains significantly higher than expected. Could you offer any insights or suggestions on how to reduce it further?