YiyangZhou / POVID

[arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
Apache License 2.0
71 stars, 3 forks