ani0075saha opened this issue 1 year ago
Hi @ani0075saha
We ran some experiments with T5. The code is in another branch: https://github.com/INK-USC/CrossFit/tree/fo
I think we customized the Hugging Face T5 implementation a bit in t5.py
so that it works with the codebase we originally developed for BART models.
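As a rough illustration (a simplified sketch, not the actual code in t5.py), one customization a T5 wrapper for a BART codebase typically needs is building `decoder_input_ids` by right-shifting the labels with T5's decoder start token, which is its pad token (id 0), rather than BART's (id 2):

```python
# Illustrative sketch only: a simplified version of T5's internal
# _shift_right, which a BART-oriented training loop would otherwise
# not perform with the right start token.

T5_DECODER_START_TOKEN_ID = 0  # T5 reuses its pad token as decoder start; BART uses id 2

def shift_right(label_ids, decoder_start_token_id=T5_DECODER_START_TOKEN_ID):
    """Prepend the decoder start token and drop the last label token."""
    return [decoder_start_token_id] + list(label_ids[:-1])
```

(The real implementation also handles masked `-100` positions and works on batched tensors; this is just the core idea.)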
There is an example fine-tuning script here: https://github.com/INK-USC/CrossFit/blob/fo/example_scripts/finetune_a_list_of_tasks_t5base.sh
I'm not familiar with nanoT5. I hope this is helpful.
Hi @cherry979988,
Thanks for sharing the implementation for your benchmark. I was able to run the BART direct fine-tuning on GLUE-SST2 and get 0.86 accuracy.
I switched the model to T5, following the model definitions from nanoT5, but I am not able to fine-tune it (high losses, zero accuracy). Is there any BART-specific pre-processing that I need to modify to make it work with T5? Any help would be greatly appreciated. If you could share a corresponding T5 fine-tuning script, that would be great.
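For context, here is a minimal sketch of the two T5-specific preprocessing changes I suspect might matter (function names are mine, not from the CrossFit codebase): T5 checkpoints were pretrained with task prefixes on the input, and T5's pad token id is 0 (BART's is 1), so label padding has to be masked with -100 for the loss to ignore it.

```python
# Hypothetical preprocessing sketch; illustrative names, not repo code.

T5_PAD_TOKEN_ID = 0  # T5's pad token id (BART uses 1)

def add_task_prefix(text, prefix="sst2 sentence: "):
    """T5 was pretrained with task prefixes; BART inputs have none."""
    return prefix + text

def mask_label_padding(label_ids, pad_token_id=T5_PAD_TOKEN_ID):
    """Replace pad positions in the labels with -100 so that
    cross-entropy loss ignores them during fine-tuning."""
    return [tok if tok != pad_token_id else -100 for tok in label_ids]
```

Is this roughly the kind of change your t5.py makes, or is there more to it?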