Do you have plans to add fine-tuning scripts for other multimodal large models? For example, Qwen_VL, LLaVA1.6, MiniGPT4, etc.

hiyouga / LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

https://arxiv.org/abs/2403.13372

Apache License 2.0

Issue #4174

Closed: xdaiycl closed this issue 2 months ago

xdaiycl commented 3 months ago

Reminder

[X] I have read the README and searched the existing issues.

System Info

None

Reproduction

None

Expected behavior

None

Others

None

BUAADreamer commented 3 months ago

We are working in #4136 to support some SOTA MLLMs that use an MLP connector, such as LLaVA-Next (1.6), Idefics2, Video-LLaVA, and LLaVA-Next-Video. MLLMs that use a Q-Former connector, such as BLIP2, Instruct-BLIP, Qwen-VL, and MiniGPT-4, will not be supported for now.
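For readers unfamiliar with the two connector families mentioned above, here is a heavily simplified, illustrative sketch in plain Python. It is not LLaMA-Factory code and not the implementation of any of the listed models; the function names, shapes, and weights are hypothetical. It assumes an MLP connector projects each vision patch independently (N patches in, N tokens out), and it stands in for a Q-Former with a single cross-attention step in which a fixed set of learned queries attends over the patches (K queries in, K tokens out). A real Q-Former has several transformer layers, self-attention, and an output projection, all omitted here.

```python
import math


def matvec(matrix, vec):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def gelu(x):
    """Tanh approximation of the GELU activation."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def mlp_connector(patches, w1, w2):
    """Two-layer MLP applied to each patch independently: N patches in, N tokens out."""
    return [matvec(w2, [gelu(h) for h in matvec(w1, p)]) for p in patches]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def query_connector(patches, queries):
    """Single cross-attention step (simplified stand-in for a Q-Former):
    each learned query attends over all patches, so K queries give K tokens."""
    dim = len(queries[0])
    tokens = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, p)) / math.sqrt(dim) for p in patches]
        weights = softmax(scores)
        tokens.append([sum(w * p[i] for w, p in zip(weights, patches)) for i in range(dim)])
    return tokens


# Hypothetical toy inputs: 4 patches of dimension 2, hidden size 3, 2 learned queries.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
w1 = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
w2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
queries = [[1.0, 0.0], [0.0, 1.0]]

print(len(mlp_connector(patches, w1, w2)))   # 4: one token per patch
print(len(query_connector(patches, queries)))  # 2: one token per learned query
```

The difference in token count is the practical point: with an MLP connector the number of visual tokens follows the number of patches, while with a query-based connector it is fixed by the number of learned queries.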