InternLM / xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
https://xtuner.readthedocs.io/zh-cn/latest/
Apache License 2.0

Does xtuner support DPO for InternVL? #943

Open fabriceyhc opened 3 weeks ago

fabriceyhc commented 3 weeks ago

I am trying to do a custom DPO fine-tuning run based on internvl_v2_internlm2_2b_lora_finetune, but that default config is oriented toward vanilla supervised fine-tuning with images. I tried to compare it against internlm2_chat_1_8b_dpo_full and incorporate changes from it, but I am running into issues with the dataset formats supported.

Is this something that xtuner actually supports at the moment?
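For context on the dataset-format mismatch: DPO needs preference pairs rather than the single-response conversations used for SFT. Below is a minimal sketch of the prompt/chosen/rejected layout commonly used for DPO training data. This is a hypothetical illustration, not xtuner's actual schema; in particular, the `image` field name for the multimodal case is an assumption.

```python
import json

def make_dpo_record(prompt, chosen, rejected, image_path=None):
    """Build one DPO preference pair in the common
    prompt/chosen/rejected layout. The exact schema that xtuner's
    DPO configs expect may differ -- treat this as a sketch."""
    record = {
        "prompt": prompt,
        "chosen": chosen,    # preferred response
        "rejected": rejected,  # dispreferred response
    }
    if image_path is not None:
        # For a multimodal model such as InternVL, each pair would
        # also need to reference its image; this field name is an
        # assumption, not a documented xtuner key.
        record["image"] = image_path
    return record

# One JSONL line of preference data (example content is made up).
pair = make_dpo_record(
    prompt="Describe the scene in the image.",
    chosen="A cyclist rides along a tree-lined path at sunset.",
    rejected="There is a picture.",
    image_path="images/0001.jpg",
)
print(json.dumps(pair))
```

The SFT config's dataset pipeline would not accept records like this, which matches the format errors described above.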

hhaAndroid commented 3 weeks ago

https://github.com/hhaAndroid/xtuner/blob/hha_0919/my_llava/README.md#dpo