xlang-ai / UnifiedSKG

[EMNLP 2022] Unifying and multi-tasking structured knowledge grounding with language models
https://arxiv.org/abs/2201.05966
Apache License 2.0

Help with reproducing T5-3b number on Spider #35

Closed yeliu918 closed 2 years ago

yeliu918 commented 2 years ago

Hi,

I'm trying to reproduce the Table 2 ST number with T5-3B on Spider. I'm using the following command on 16 A100 GPUs:

deepspeed train.py --deepspeed deepspeed/ds_config_zero2.json --seed 2 \
  --cfg Salesforce/T5_3b_finetune_spider_with_cell_value.cfg \
  --run_name T5_3b_finetune_spider \
  --logging_strategy steps --logging_first_step true --logging_steps 4 \
  --evaluation_strategy steps --eval_steps 250 \
  --metric_for_best_model avr --greater_is_better true \
  --save_strategy steps --save_steps 250 --save_total_limit 1 \
  --load_best_model_at_end \
  --gradient_accumulation_steps 8 --num_train_epochs 80 \
  --adafactor false --learning_rate 5e-5 \
  --do_train --do_eval --do_predict --predict_with_generate \
  --output_dir output/T5_3b_finetune_spider --overwrite_output_dir \
  --per_device_train_batch_size 1 --per_device_eval_batch_size 1 \
  --generation_num_beams 4 --generation_max_length 128 \
  --input_max_length 1024 --ddp_find_unused_parameters true

Does it look right? I get 68.83 with this command. Could you share the command that reproduces the 71.76 reported for Spider? Thanks!

ChenWu98 commented 2 years ago

Hi, can you try --generation_num_beams 1 instead of --generation_num_beams 4? It should produce better results.
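For reference, a minimal sketch of the suggested change, assuming everything else in the original launch command stays exactly as posted above and only the beam-search width is switched to greedy decoding:

```shell
# Same deepspeed launch as before; the only change is
# --generation_num_beams 4  ->  --generation_num_beams 1
deepspeed train.py --deepspeed deepspeed/ds_config_zero2.json --seed 2 \
  --cfg Salesforce/T5_3b_finetune_spider_with_cell_value.cfg \
  --run_name T5_3b_finetune_spider \
  --do_train --do_eval --do_predict --predict_with_generate \
  --output_dir output/T5_3b_finetune_spider --overwrite_output_dir \
  --per_device_train_batch_size 1 --per_device_eval_batch_size 1 \
  --generation_num_beams 1 --generation_max_length 128 \
  --input_max_length 1024 --ddp_find_unused_parameters true
  # ... plus the remaining logging/eval/save/optimizer flags from the
  # original command, unchanged.
```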

ChenWu98 commented 2 years ago

I'll close this issue. If you have further questions, feel free to re-open it!