
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
https://rlhf-v.github.io