wangclnlp / Vision-LLM-Alignment

This repo contains the code for supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) designed for vision LLMs.

cogvlm2 support #3

Closed: kaka-Cao closed this issue 1 month ago

kaka-Cao commented 1 month ago

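Hello! We recently trained our own domain-specific model based on the cogvlm2 multimodal LLM, and we would like to run reinforcement learning training after SFT. Does this repository support DPO training for cogvlm2, or could we discuss with you how to achieve this?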

wangclnlp commented 1 month ago

Thank you for your interest! Please contact me (email: clwang1119@gmail.com or wechat: wly12110519), and we can help you use our system for the alignment training of this model.