Hi!
Yes, that config was used for the DG-MAML experiments, where the number of training steps is held fixed for a fair comparison. To obtain a stronger base parser, you can either increase the number of training steps, or increase num_batch_accumulated to 2 or 3 and set loss_type to label_smooth.
Hope it helps
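For concreteness, the suggested changes might look roughly like this in the training config. This is only a sketch: num_batch_accumulated and loss_type are the keys named above, while max_steps and the surrounding layout are assumptions, not the repo's actual config file.

```yaml
# Sketch only -- the layout is assumed, not copied from the repo.
train:
  max_steps: 40000            # hypothetical key; raise the step budget beyond 20K
  num_batch_accumulated: 2    # or 3; accumulates gradients over more batches per update
  loss_type: label_smooth     # enable label smoothing on the loss
```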
Hi,
I ran the base model with the following command:
After 20K training steps, I got the following performance:
Step: 19000 match score: 0.6624758220502901 exe score: 0.6702127659574468
Step: 20000 match score: 0.660541586073501 exe score: 0.6789168278529981
Does this make sense? The numbers you report in the paper (https://arxiv.org/abs/2104.05827) are around 70 or higher. Should I train for more steps, say 80K-100K?
Thanks!