DPO training - Githubissues

TideDra / VL-RLHF

An RLHF Infrastructure for Vision-Language Models

Apache License 2.0

100 stars, 6 forks

DPO training #16

Closed. XxxZzD closed this issue 1 month ago

XxxZzD commented 1 month ago

Does DPO training with LLaVA support multi-image input? For example, two images per sample.