Add VLFeedback, the first DPO dataset for LVLMs

TobiasLee commented 8 months ago

Hi authors, thanks for your great survey and the curated paper list, which we cite in our related work as a reference for a detailed introduction to LVLMs.

I'd like to recommend our recent work exploring DPO for LVLMs for the list (in the Dataset part, or maybe even a new RLHF section that also includes RLHF-V and LLaVA-RLHF?). Based on our VLFeedback dataset, annotated with GPT-4V on 80k high-quality multi-modal instructions, we found that DPO is promising on benchmarks such as MME and MMHal-bench.

Project Page: https://vlf-silkie.github.io/
Paper: https://arxiv.org/abs/2312.10665
Dataset: https://huggingface.co/datasets/MMInstruction/VLFeedback
Code: https://github.com/vlf-silkie/VLFeedback
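
For readers unfamiliar with DPO, here is a minimal sketch of the standard per-pair DPO objective that a preference dataset like this one would feed. It is only an illustration and is not taken from the VLFeedback code: the function name `dpo_pair_loss`, its argument names, and the default `beta=0.1` are assumptions, and the inputs are assumed to be summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model.

```python
import math


def dpo_pair_loss(policy_chosen_logp, policy_rejected_logp,
                  ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    All names here are hypothetical; beta=0.1 is an illustrative value.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)
    # loss = -log(sigmoid(z)) = log(1 + exp(-z)), computed stably for either sign of z.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy equals the reference, the loss is log 2; it drops below that as the policy favors the chosen response more strongly than the reference does.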

xjtupanda commented 8 months ago

Thanks for sharing and for the citation! We've incorporated your work into our repo.

TobiasLee commented 8 months ago

Thank you for including our work! :tada::tada: We have provided the GPT-4V annotation guide in the main paper and appendix, and we will upload it to the code repo soon.

BradyFU / Awesome-Multimodal-Large-Language-Models

Add VLFeedback, the first DPO dataset for LVLMs #101