Closed — kyriemao closed this issue 5 months ago
This looks a bit weird. What are your batch size and training group size settings?
The parameter settings are:
I used 6 A100 40GB GPUs for training.
Solved. It was caused by a bug in my own handling of the EOS token. Thanks!
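For anyone who hits this later: a common pitfall of this kind is that LLaMA-style tokenizers typically do not append the EOS token by default, so an embedding model that pools the last token must add it explicitly (and keep it from being truncated away). The helper below is a minimal hypothetical sketch of that idea, not the actual repo code:

```python
# Hypothetical sketch: ensure the EOS token is always present as the
# final input token before last-token pooling. The function name and
# signature are illustrative, not taken from the RepLLaMA codebase.
def build_input_ids(token_ids, eos_token_id, max_length):
    # Truncate first, reserving one slot so the EOS token is never cut off.
    ids = list(token_ids[: max_length - 1])
    ids.append(eos_token_id)
    return ids

ids = build_input_ids(range(10), eos_token_id=2, max_length=8)
# The sequence is capped at max_length and ends with the EOS id.
assert len(ids) == 8 and ids[-1] == 2
```

If the EOS token is missing (or silently truncated), the representation pooled from the last position corresponds to an arbitrary content token, which can produce exactly this kind of anomalous loss curve.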
Hello, I ran into the same problem. Could you please tell me how you solved it? Thanks a lot!
Hi Xueguang,
Great work! I am training my own RepLLaMA now and find that the training loss starts above 90 and quickly drops below 0.1 within about 30 steps (as shown below). Is this normal, or could you please share your RepLLaMA training log?
Thanks!