X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
https://www.modelscope.cn/studios/damo/mPLUG-Owl
MIT License
2.25k stars 171 forks

cur_input_embeds = torch.cat([cur_input_embeds_1, cur_image_features[0:0], cur_input_embeds_2], dim=0): here cur_image_features[0:0] is an empty slice with no rows, so the image features are never actually added #193

Open hangzeli05 opened 9 months ago

hangzeli05 commented 9 months ago

Code error in mPLUG-Owl2

vateye commented 9 months ago

No, this is for compatibility with DeepSpeed ZeRO-3 when training on text-only samples. Multi-modal inputs never take this code path.
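
For anyone else reading, here is a minimal sketch of what the empty slice does. It is not the mPLUG-Owl2 code: the tensor names are borrowed from the issue title, while the shapes, the `hidden` width, and the random data are made up for illustration. It shows that concatenating `cur_image_features[0:0]` leaves the values unchanged but still ties the result to the image features in the autograd graph, which is presumably what the ZeRO-3 setup relies on for text-only batches.

```python
import torch

hidden = 8  # hypothetical embedding width, chosen only for illustration

# Stand-ins for the text embeddings on either side of the split point.
cur_input_embeds_1 = torch.randn(3, hidden)
cur_input_embeds_2 = torch.randn(2, hidden)

# Stand-in for the image features; requires_grad mimics output that
# depends on trainable vision parameters.
cur_image_features = torch.randn(5, hidden, requires_grad=True)

# Zero rows, same width: shape (0, hidden).
empty_slice = cur_image_features[0:0]

cur_input_embeds = torch.cat(
    [cur_input_embeds_1, empty_slice, cur_input_embeds_2], dim=0
)

# Plain text-only concatenation, for comparison.
text_only = torch.cat([cur_input_embeds_1, cur_input_embeds_2], dim=0)

# The values match the text-only result exactly.
same_values = torch.equal(cur_input_embeds, text_only)

print(tuple(empty_slice.shape))        # shape of the empty slice
print(tuple(cur_input_embeds.shape))   # no extra rows were added
print(same_values)
print(cur_input_embeds.requires_grad)  # graph now includes the image features
print(text_only.requires_grad)         # plain concatenation has no such link
```

So the observation in the issue is correct (no image rows are added), but in this sketch the only effect of the empty slice is to keep the image branch connected to the output.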