OpenBMB / MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
Apache License 2.0
ZeRO-3 support #273

Open qyc-98 opened 2 weeks ago

qyc-98 commented 2 weeks ago

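This PR modifies the resampler and minicpmv model files of our model on Hugging Face, so it needs to be submitted together with the corresponding Hugging Face PR. It mainly fixes two problems: ZeRO-3 currently requires parameters to be forcibly gathered, and some variables in the main model were not moved by DeepSpeed to the correct GPU in time at runtime. Together these prevented MiniCPM-V 2 from being fine-tuned with the ZeRO-3 algorithm.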
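For context, ZeRO-3 is selected through the `zero_optimization.stage` field of a DeepSpeed config. The following is only a minimal sketch of such a config, not the one used in this PR: every value, the `ds_config` name, the `write_config` helper, and the output file name are illustrative assumptions.

```python
import json

# Hypothetical minimal DeepSpeed config that enables ZeRO stage 3.
# All values are illustrative assumptions, not the settings from this PR.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 1,
    "bf16": {"enabled": True},
    "zero_optimization": {
        # Stage 3 partitions parameters, gradients, and optimizer states
        # across GPUs, so full parameters exist only when gathered.
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
        # Gather the full 16-bit weights when saving the model.
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}


def write_config(path: str = "ds_config_zero3.json") -> str:
    """Write the config to a JSON file (hypothetical helper) and return its path."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(ds_config, f, indent=2)
    return path
```

Because stage 3 shards the parameters themselves, model code that reads a weight directly or creates tensors on a fixed device can break under it, which is the kind of issue the model-file changes described above address.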

whyiug commented 6 days ago

Hey, I wonder why this PR is still pending.