Closed: hohoCode closed this issue 6 months ago
Flan-T5 is a fine-tuned version of T5, so pretraining it from scratch wouldn't make sense, since you would essentially be throwing away the benefits of the Flan fine-tuning, correct?
This is correct @IdeaKing , thanks for answering!
I mean, is it possible to reproduce Flan-T5? Say you have T5; if it's possible to fine-tune it to get Flan-T5, that would be awesome.
Yeah, this is what the repo does too. What I did and tested is fine-tuning on Natural Instructions, which is a subset of the Flan collection. Instead, you can download the entire Flan collection from HF and fine-tune on it using the config used for Natural Instructions. I would say it should roughly work straight away; if it doesn't, look up the fine-tuning hyperparameters in the Flan paper.
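For anyone trying this, the core of that fine-tuning setup is just flattening each instruction record into the plain text-to-text pair that T5 training consumes. A minimal sketch, assuming Natural-Instructions-style field names (`definition`, `inputs`, `targets`); the exact keys depend on which Flan split you download, so adjust accordingly:

```python
# Sketch: flatten an instruction record into a (source, target) text pair
# for T5-style seq2seq fine-tuning. Field names are an assumption based on
# the Natural Instructions layout, not guaranteed for every Flan split.

def to_text2text(record: dict) -> tuple[str, str]:
    """Prepend the task definition to the instance input; keep the target as-is."""
    source = f"{record['definition']} {record['inputs']}".strip()
    target = record["targets"].strip()
    return source, target

# Hypothetical example record, purely for illustration:
example = {
    "definition": "Translate the sentence to German.",
    "inputs": "The house is small.",
    "targets": "Das Haus ist klein.",
}

src, tgt = to_text2text(example)
print(src)  # Translate the sentence to German. The house is small.
print(tgt)  # Das Haus ist klein.
```

The resulting pairs are then tokenized and fed to whatever seq2seq training loop the repo's Natural Instructions config drives.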
Thanks!
First of all, great work! Thanks for sharing.
Just wondering whether it is possible to train Flan-T5 from scratch. Any thoughts or ideas on this?
Thanks!