yonseivnl / vlm-rlaif

ACL'24 (Oral) Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback
Apache License 2.0
52 stars, 3 forks

Add RLAIF training code #3

Closed by Yuuraa 5 months ago