vietai / ViT5

MIT License

Pretraining #11

Closed: haisonle001 closed this issue 1 year ago

haisonle001 commented 1 year ago

Hi,

I want to pretrain ViT5 on another Vietnamese dataset (the P3 dataset, for example). Could I have the pretraining script your team used to pretrain on CC100?

Thank you in advance.

justinphan3110 commented 1 year ago

We migrated all pretraining scripts to another branch, as they may be outdated. You can take a look at them here: https://github.com/vietai/ViT5/tree/mesh_tf1/pretraining_mesh

enpassanty commented 1 year ago

I have a custom dataset that I need to pretrain on and then finetune with a long context (16,384 tokens). I was thinking of using some of your pretraining scripts above but adapting them for LongT5. Do you think this is doable? Do you have any advice on this? Thank you.