Open tomerwolgithub opened 3 years ago
Varying the batch size your way shouldn't change performance because the effective batch size is kept at 32. Would you mind sharing your training curve or the accuracy change log?
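Just to spell out the equivalence here: with equal-sized micro-batches, averaging the per-micro-batch gradients over the accumulation steps reproduces the full-batch gradient exactly, so batch size 8 with 4 accumulation steps should behave like batch size 32. A minimal sketch in plain Python (not the repo's actual training loop; `grad_mse` is a toy 1-D linear model made up for illustration):

```python
def grad_mse(w, xs, ys):
    """Mean-squared-error gradient for a toy 1-D linear model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

xs = [float(i) for i in range(32)]
ys = [2.0 * x for x in xs]
w = 0.5

# Gradient over one full batch of 32.
full_grad = grad_mse(w, xs, ys)

# Four micro-batches of 8 with gradient accumulation: each micro-batch
# gradient is scaled by 1 / accum_steps before being summed.
accum_steps = 4
accum_grad = 0.0
for i in range(0, 32, 8):
    accum_grad += grad_mse(w, xs[i:i + 8], ys[i:i + 8]) / accum_steps

assert abs(full_grad - accum_grad) < 1e-9
```

This identity holds only when the micro-batches are the same size; in practice, small per-step batches can still differ slightly from a true large batch through batch-dependent ops (e.g. padding or any batch-level normalization), which is one place hardware-driven batch changes can leak into results.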
Sure thing. Below are the EM performance curves for all four of the models I tried to train. The top-performing model (batch size 8, accumulation steps 4) managed to score 67.1%:
Please let me know if there's any other info that might help. Thanks!
The difference might be caused by the Data Repair step described in section 4.3 of the paper.
I've modified the data processing scripts to incorporate this step; please `git pull`
and follow the instructions here:
https://github.com/salesforce/TabularSemanticParsing#spider
I still got the same EM accuracy of 66.7% on the Spider dev set. I pulled the latest repo last week and ran the data repair. My config matches the default exactly, except the training and dev batch sizes are 8 instead of 16 and 24.
Should I use a different configuration to reach 70.0%?
First of all, thank you for sharing this terrific work. I found it really straightforward to plug and start training.
However, when training Bridge-L (with BERT-large) on Spider I'm unable to reach 70.0% EM on the dev set. My results keep peaking at around 66.7%. I'm training on a GeForce RTX 3090, and the default setting with batch size 16 was too much for its 24 GB of memory. So, I've tried a few runs with batch sizes 8, 4, and 2 and accumulation steps 4, 8, and 16 respectively (other hyperparameters are the defaults). All of these runs capped out below 67% after more than 100K steps.
I was wondering whether this is simply a result of the different hardware and my per-step batch sizes being smaller than 16, or whether I'm missing something?
Thanks!