Hi, I found that training the models for 40k steps is enough in most settings, and training them longer can sometimes degrade performance. I think you can start with 40k steps and see if the performance is good enough.
Thank you very much!
Hello, I'm interested in your work and would like to know the training parameters for the deen dataset. The dataset seems too large for max_step 40000 to use all of the deen training data: it has about 1900000 examples, which equals 237550 steps if the total batch size is 8, as the scripts show. Could you share the exact parameters for deen training, such as the batch size and the actual number of training steps? Thanks!
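For reference, the arithmetic behind that estimate can be sketched as below. This is only a rough check, assuming the rounded figure of 1900000 examples and a total batch size of 8 from the post; the function names are made up for illustration and are not part of the repository's scripts. With the rounded count the result is 237500 steps per epoch, so the 237550 quoted above presumably comes from the exact dataset size.

```python
import math


def steps_per_epoch(num_examples: int, total_batch_size: int) -> int:
    """Number of optimizer steps needed to see every example once."""
    return math.ceil(num_examples / total_batch_size)


def epochs_covered(max_steps: int, num_examples: int, total_batch_size: int) -> float:
    """Fraction of one pass over the data that max_steps actually covers."""
    return max_steps * total_batch_size / num_examples


# Assumed values taken from the post (the dataset size is approximate).
num_examples = 1_900_000
total_batch_size = 8
max_steps = 40_000

print(steps_per_epoch(num_examples, total_batch_size))
print(round(epochs_covered(max_steps, num_examples, total_batch_size), 3))
```

Under these assumptions, 40000 steps would cover roughly 17% of one epoch, which is the gap the question is asking about.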