X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
https://www.modelscope.cn/studios/damo/mPLUG-Owl
MIT License

code and dataset for pretrain #70

Closed qiuhuiGithub closed 1 year ago

qiuhuiGithub commented 1 year ago

Hi, will you release the code and dataset for pretraining?

MAGAer13 commented 1 year ago

For the pretraining dataset, you can refer to LAION-400M and COYO-700M, which are publicly available.
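For anyone who wants to get started with those corpora, here is a minimal sketch that streams image-text pairs with the Hugging Face `datasets` library. Both corpora are distributed as URL-plus-caption metadata rather than raw images; the hub id `kakaobrain/coyo-700m` and its `url`/`text` column names are assumptions based on the public hub listing, not something provided by this repo:

```python
# Minimal sketch: stream image-text pairs from COYO-700M for pretraining.
# Assumes the hub id "kakaobrain/coyo-700m" with "url"/"text" columns;
# LAION-400M is distributed similarly as URL+caption metadata.
import io

import requests
from datasets import load_dataset
from PIL import Image

ds = load_dataset("kakaobrain/coyo-700m", split="train", streaming=True)

for sample in ds.take(4):
    caption = sample["text"]
    try:
        resp = requests.get(sample["url"], timeout=5)
        image = Image.open(io.BytesIO(resp.content)).convert("RGB")
    except Exception:
        continue  # dead links are common in web-scale URL lists
    print(image.size, caption[:60])
```

In practice, tools like img2dataset are typically used to bulk-download these URL lists into sharded local storage before training.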

For the pre-training code, we have not had time to prepare a release. You can refer to the finetuning scripts, which are similar except for the prompt. Alternatively, you can directly use the checkpoint we provide, which is what we recommend.
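To illustrate the "similar except for the prompt" point, here is a sketch contrasting the instruction-style template used for finetuning with a plain captioning prompt one might use for pretraining. The instruction template mirrors the wording in the repo's examples as I understand it; treat the exact strings as assumptions and verify against the finetuning scripts:

```python
# Sketch of the prompt difference between finetuning and pretraining.
# The instruction template below mirrors the repo's examples; the exact
# wording is an assumption, so check the finetuning scripts before use.

def build_finetune_prompt(question: str, answer: str) -> str:
    # Instruction-tuning format: system preamble plus Human/AI turns.
    return (
        "The following is a conversation between a curious human and AI "
        "assistant. The assistant gives helpful, detailed, and polite "
        "answers to the user's questions.\n"
        "Human: <image>\n"
        f"Human: {question}\n"
        f"AI: {answer}"
    )

def build_pretrain_prompt(caption: str) -> str:
    # Pretraining format: just the image placeholder followed by its
    # web-scraped caption, with no instruction wrapper.
    return f"<image>{caption}"

print(build_finetune_prompt("What is in the image?", "A dog on a beach."))
print(build_pretrain_prompt("A dog running on a sandy beach at sunset."))
```

If you would rather skip pretraining entirely, the released checkpoints on the Hugging Face Hub can be loaded directly as described in the repo README.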