Closed: toke1220 closed this issue 2 years ago
Hello! Thank you for your excellent work! I am training the model from scratch on a GPU with the batch size (bs) set to 4, but train_loss hardly converges in the first few epochs (5 epochs). Is this normal? Could it be caused by the batch size?
I switched to different hardware and set bs to 8 (the same as in train.sh), and the loss converged normally. Maybe bs needs to be at least 8.
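For anyone whose GPU only fits bs 4: one possible workaround is gradient accumulation, where gradients from two micro-batches of 4 are averaged before each optimizer step to mimic bs 8. I have not tried this with this repository, so treat the following as a rough sketch in plain Python on a toy one-parameter model. The names `mse_grad` and `accumulated_grad` and the data are made up for illustration and are not part of the project's code.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_bs):
    """Average the gradients of equal-size micro-batches (hypothetical helper)."""
    assert len(xs) % micro_bs == 0, "micro-batches must be the same size"
    accum_steps = len(xs) // micro_bs
    total = 0.0
    for start in range(0, len(xs), micro_bs):
        chunk_x = xs[start:start + micro_bs]
        chunk_y = ys[start:start + micro_bs]
        # Scale each micro-batch gradient so the sum is an average.
        total += mse_grad(w, chunk_x, chunk_y) / accum_steps
    return total


# Toy data: 8 samples, i.e. one batch of 8 or two micro-batches of 4.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 7.9]
w = 0.3

full = mse_grad(w, xs, ys)                       # gradient with bs 8
accum = accumulated_grad(w, xs, ys, micro_bs=4)  # two steps of bs 4
print(abs(full - accum) < 1e-9)
```

The two gradients match here because the loss is a plain mean over samples. If the model uses layers that depend on per-batch statistics (batch normalization, for example), those layers would still see only 4 samples at a time, so accumulation may not fully reproduce the bs 8 behaviour.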