facebookresearch / FiD

Fusion-in-Decoder

About the hyperparameters of finetuning t5-base #11

Closed shunyuzh closed 2 years ago

shunyuzh commented 2 years ago

Hi @gizacard ,

Thanks for your awesome project. I would like to know the hyperparameters for fine-tuning T5-base.

You have only shared T5-large's hyperparameters in the tutorial, as follows. Could you share T5-base's as well?

python train_reader.py \
        --use_checkpoint \
        --lr 0.00005 \
        --optim adamw \
        --scheduler linear \
        --weight_decay 0.01 \
        --text_maxlength 250 \
        --per_gpu_batch_size 1 \
        --n_context 100 \
        --total_step 15000 \
        --warmup_step 1000

Thanks, looking forward to your reply.

gizacard commented 2 years ago

Hi, we used a learning rate of 1e-4 for the base model; the rest should be similar.
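
For reference, a minimal sketch of the corresponding T5-base command, assuming everything else matches the T5-large recipe quoted above. Only the learning rate changes per this reply; the --model_size flag is an assumption based on the repository's README and is not confirmed in this thread:

# Assumed T5-base variant: --lr set to 1e-4 per the reply above;
# --model_size base is an assumption from the FiD README, all other
# flags copied unchanged from the T5-large command in the question.
python train_reader.py \
        --model_size base \
        --use_checkpoint \
        --lr 0.0001 \
        --optim adamw \
        --scheduler linear \
        --weight_decay 0.01 \
        --text_maxlength 250 \
        --per_gpu_batch_size 1 \
        --n_context 100 \
        --total_step 15000 \
        --warmup_step 1000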