How to fine-tune MoE-LLaVA on your own dataset

PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

https://arxiv.org/abs/2401.15947

Apache License 2.0

1.9k stars, 121 forks

Open Tunanzzz opened 5 months ago

Tunanzzz commented 5 months ago

The scripts in train.md only cover the three training stages. How can I fine-tune the already-trained stage-3 model on my own dataset?

zhengxingmao commented 5 months ago

+1, same question.

CharlieFRuan commented 4 months ago

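You can reuse train.py. Replace the model class, for example MoELLaVAStablelmForCausalLM, with its Eval counterpart, EvalMoELLaVAStablelmForCausalLM; after that you no longer need to call initialize_moe_modules(). Then call requires_grad_() on the parameters as needed to choose which ones to train.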
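The last step of this suggestion (choosing what to train with requires_grad_()) could look roughly like the sketch below. It is not code from the MoE-LLaVA repository: the helper `set_trainable` and the name patterns in `TRAINABLE_PATTERNS` are hypothetical, and the import path and checkpoint path in the trailing comment are assumptions to check against your copy of train.py.

```python
from fnmatch import fnmatchcase

# Hypothetical name patterns. Which parameters to train is an experiment
# choice; these values are illustrative and not taken from the repo.
TRAINABLE_PATTERNS = ["*.mlp.*", "*mm_projector*"]


def set_trainable(model, patterns):
    """Freeze all parameters except those whose name matches a pattern.

    Works with any object that exposes named_parameters() and whose
    parameters support requires_grad_(bool), e.g. a torch.nn.Module.
    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        keep = any(fnmatchcase(name, pattern) for pattern in patterns)
        param.requires_grad_(keep)
        if keep:
            trainable.append(name)
    return trainable


# Assumed usage inside a copy of train.py (not executed here; the import
# path and the checkpoint path are assumptions):
#
#   from moellava.model import EvalMoELLaVAStablelmForCausalLM
#   model = EvalMoELLaVAStablelmForCausalLM.from_pretrained("path/to/stage3-checkpoint")
#   # No initialize_moe_modules() call, as suggested in the comment above.
#   print(set_trainable(model, TRAINABLE_PATTERNS))
```

Printing the returned names before starting a run is a cheap way to confirm that the patterns select the layers you intended.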

chunhuizng commented 3 months ago

Replying to CharlieFRuan's suggestion above (reuse train.py, swap in EvalMoELLaVAStablelmForCausalLM, skip initialize_moe_modules(), then set requires_grad_() as needed):

Hi, do you have a working implementation of this that you could share? Just asking, thanks!

maodou-shushu commented 2 months ago

Replying to CharlieFRuan's suggestion above (reuse train.py, swap in EvalMoELLaVAStablelmForCausalLM, skip initialize_moe_modules(), then set requires_grad_() as needed):

Hi, is there a blog post that walks through this in practice? Thanks.