X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
https://www.modelscope.cn/studios/damo/mPLUG-Owl
MIT License

How multi-modal and text-only data are mixed #145

Open JustQJ opened 1 year ago

JustQJ commented 1 year ago

The paper mentions that the second-stage fine-tuning uses both multi-modal data (LLaVA) and text-only data (Alpaca, Vicuna, Baize). I'd like to ask whether these two kinds of data are mixed together during training or trained separately. If they are mixed together, how is the ratio of text-only to multi-modal samples within each batch determined? Thanks!

MAGAer13 commented 1 year ago

We randomly mix the text-only data and the multi-modal data. We do not control the ratio within each batch; samples are simply drawn at random, so the ratio within a batch is, in expectation, the same as the ratio within the overall dataset.
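
For readers wondering what this looks like in practice, below is a minimal PyTorch sketch of the mixing strategy described above. This is not the mPLUG-Owl training code; the dataset classes, names, and sizes are illustrative assumptions. It shows that concatenating the pools and shuffling yields batches whose multi-modal fraction matches the dataset-level fraction on average, without any per-batch control.

```python
import torch
from torch.utils.data import Dataset, ConcatDataset, DataLoader

class ToyDataset(Dataset):
    """Stand-in for a real dataset; each item is just its modality tag."""
    def __init__(self, modality: str, size: int):
        self.modality, self.size = modality, size

    def __len__(self):
        return self.size

    def __getitem__(self, idx):
        return self.modality

# Sizes are arbitrary placeholders, not the real LLaVA/Alpaca/Vicuna/Baize counts.
multimodal = ToyDataset("multimodal", 150_000)
text_only = ToyDataset("text", 350_000)

# Concatenate the two pools and shuffle: each batch is a uniform random
# draw from the combined pool, so no per-batch ratio is enforced.
loader = DataLoader(
    ConcatDataset([multimodal, text_only]),
    batch_size=256,
    shuffle=True,
)

# Inspect one batch: its multi-modal fraction fluctuates around the
# dataset-level fraction (150k / 500k = 0.30).
batch = next(iter(loader))
ratio = sum(m == "multimodal" for m in batch) / len(batch)
print(f"multi-modal fraction in this batch: {ratio:.2f} "
      f"(dataset-level: {150_000 / 500_000:.2f})")
```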