Analysis of the hallucination benchmark result in the Appendix of your paper #1

foundation-multimodal-models / CAL

[NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment

Apache License 2.0


Hi, I apologize for the delayed reply; I am currently occupied with graduation preparations and related travel.

Thanks for your feedback. In my view, the POPE benchmark may not be the best choice for evaluating hallucination, because its scores are uniformly very high and vary little. Alternative benchmarks may indeed be better suited to this kind of assessment (for more information, please refer to https://arxiv.org/pdf/2312.00849). After my vacation, I will add evaluation results on these related benchmarks if possible.
