Closed by nguyenquangtan 1 week ago
As far as I know, when training a large model you need a larger batch size, a lower learning rate, more data, and just one epoch. I always follow this.
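To make the one-epoch rule of thumb concrete, here is a rough sketch of how the number of optimizer steps follows from the dataset size and the effective batch size. It is not taken from the paper or the repository: the function name `one_epoch_steps` and all the numbers (dataset size, per-device batch size, device count, gradient accumulation steps) are hypothetical and only for illustration.

```python
import math


def one_epoch_steps(num_samples, per_device_batch, num_devices=1, grad_accum=1):
    """Return (effective_batch_size, optimizer_steps) for one pass over the data.

    All arguments are illustrative assumptions, not values from the paper.
    """
    # A larger effective batch can be reached with more devices or by
    # accumulating gradients over several forward/backward passes.
    effective_batch = per_device_batch * num_devices * grad_accum
    # One epoch means every sample is seen once, so training stops after this many steps.
    steps = math.ceil(num_samples / effective_batch)
    return effective_batch, steps


# Hypothetical setup: 1M samples, 8 devices, 16 samples per device, 4 accumulation steps.
effective_batch, steps = one_epoch_steps(
    num_samples=1_000_000, per_device_batch=16, num_devices=8, grad_accum=4
)
print(effective_batch, steps)
```

With a single epoch, the stopping point is simply the last step of that pass, so there is no need to pick a target loss in advance.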
Thank you for the information. May I ask some additional questions:
Hi, I am trying to replicate the Chart-to-Table Translation stage described in your paper. This is my first time pre-training such a big model on a large-scale dataset, so I don't know when I should stop the pre-training loop. Could you please share the number of epochs and the best loss you reached when pre-training the ChartAst-D and ChartAst-S models in the Chart-to-Table Translation stage?
Thank you for your attention to this matter.