junkangwu / alpha-DPO

https://arxiv.org/abs/2410.10148
MIT License

Multimodal large language models (MLLM) #2

Closed: HZWHH closed this issue 1 week ago

HZWHH commented 1 week ago

Can this method be applied to multimodal large language models (MLLMs)?

junkangwu commented 1 week ago

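It should be possible. alpha-DPO can be viewed as a variant of DPO, so in a DPO-for-MLLM (DPO4MLLM) setting you only need to replace the loss function with the alpha-DPO loss.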
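
A minimal sketch of what "replacing the loss function" could look like, assuming the training loop already produces summed sequence log-probabilities for the chosen and rejected responses under the policy and the reference model. The names `PreferenceLoss`, `dpo_loss`, and `preference_step` are hypothetical and not taken from this repository. Only the standard DPO loss is written out; the alpha-DPO loss is not reproduced here and would be supplied as `loss_fn` from the paper or this repo's code.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical shared signature for a pairwise preference loss:
# (policy_chosen_logp, policy_rejected_logp, ref_chosen_logp, ref_rejected_logp) -> loss
PreferenceLoss = Callable[[float, float, float, float], float]


def dpo_loss(pc: float, pr: float, rc: float, rr: float, beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    # Margin between the chosen and rejected log-ratios (policy vs. reference).
    z = beta * ((pc - rc) - (pr - rr))
    # Numerically stable form of -log(sigmoid(z)) = log(1 + exp(-z)).
    if z > 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


def preference_step(
    batch: List[Tuple[float, float, float, float]],
    loss_fn: PreferenceLoss = dpo_loss,
) -> float:
    """Average the pairwise loss over a batch.

    In an MLLM the log-probabilities would be conditioned on image and text
    inputs, but the loss only sees the resulting numbers, so swapping
    `loss_fn` is the only change this sketch needs.
    """
    return sum(loss_fn(*pair) for pair in batch) / len(batch)
```

To try another objective, pass a different function with the same four-argument signature, for example `preference_step(batch, loss_fn=my_alpha_dpo_loss)`, where `my_alpha_dpo_loss` is a placeholder for an implementation you provide.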

HZWHH commented 1 week ago

OK, thanks.