shizhediao / DaVinci

Source code for the paper "Prefix Language Models are Unified Modal Learners"
BSD 3-Clause "New" or "Revised" License

Hyperparameters for pretraining #3

Open williamium3000 opened 1 year ago

williamium3000 commented 1 year ago

Hi, nice work! Could you release the hyperparameters used for pretraining (batch size, number of GPUs, etc.)? It would also be helpful to have a table of hyperparameters for each of the dataset combinations (e.g., ID, ID+SWD, etc.).

Thanks