OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Apache License 2.0

In Technical Report #421

univa-JASON closed this issue 3 weeks ago

univa-JASON commented 1 month ago

How did you carry out DPO training: with CPMTrainer or the HF DPOTrainer? Does CPMTrainer support DPO finetuning?

yiranyyu commented 1 month ago

We wrote the training code based on the RLAIF-V project. The code implements its own trainer for DPO.
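
For readers unfamiliar with DPO, the objective that such a trainer optimizes can be sketched as below. This is only an illustration of the standard DPO loss for a single preference pair, not the RLAIF-V or MiniCPM-V code; the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this example.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response under
    the policy being trained and under a frozen reference model.
    All names here are hypothetical, chosen for illustration.
    """
    # Log-ratio of policy to reference for each response.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # Scaled margin by which the policy prefers the chosen response.
    margin = beta * (chosen_logratio - rejected_logratio)
    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference agree, the margin is zero and the loss equals log 2; the loss falls as the policy assigns relatively more probability to the chosen response than the reference does.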

univa-JASON commented 1 month ago

Thank you for your answer! I have one more question: can I apply the WSD scheduler in this repo's finetuning code?
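
For context, WSD stands for warmup-stable-decay: the learning rate ramps up, holds at its peak, then decays. The sketch below shows the shape of such a schedule as a multiplier on the peak learning rate. It is not taken from this repo; the function name `wsd_lr_factor`, its parameters, and the choice of a linear decay phase are all assumptions made for illustration.

```python
def wsd_lr_factor(step, warmup_steps, stable_steps, decay_steps, min_factor=0.0):
    """Multiplier on the peak learning rate at a given step.

    Hypothetical helper: linear warmup, constant plateau, then a
    linear decay to `min_factor` (the decay shape is an assumption).
    """
    # Warmup: ramp linearly up to the peak.
    if step < warmup_steps:
        return (step + 1) / warmup_steps
    # Stable: hold the peak learning rate.
    if step < warmup_steps + stable_steps:
        return 1.0
    # Decay: move linearly from the peak down to min_factor.
    if decay_steps <= 0:
        return min_factor
    progress = (step - warmup_steps - stable_steps) / decay_steps
    progress = min(progress, 1.0)
    return 1.0 - (1.0 - min_factor) * progress
```

A function of this form could in principle be passed to a lambda-based scheduler such as PyTorch's `torch.optim.lr_scheduler.LambdaLR`; whether this repo's finetuning script exposes a hook for that is not stated in this thread.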