DAMO-NLP-SG / VCD

[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Apache License 2.0

About the Experimental Setup #15

Closed haohaodw closed 3 weeks ago

haohaodw commented 1 month ago

I am very interested in your excellent work and have a question about the experiments. At what temperature were the Regular and VCD settings in Table 1 and Table 2 tested? Looking forward to your reply!

LengSicong commented 3 weeks ago

We apply direct sampling without any temperature normalization in our main experiments. For more ablations on decoding strategies, please refer to Section B.4, "Effect of Different Sampling Strategies", in the Appendix.
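For readers unfamiliar with the term, here is a minimal sketch of what direct sampling could look like. It reads "without any temperature normalization" as sampling from the softmax of the unscaled logits, which is equivalent to a temperature of 1.0; that reading is an assumption. The function names and the toy logits are hypothetical and are not taken from the VCD repository.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature=1.0 leaves the logits unscaled."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the categorical distribution over the vocabulary."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Toy next-token logits over a 3-token vocabulary (illustrative values only).
logits = [2.0, 1.0, 0.1]
direct = softmax(logits)                      # direct sampling distribution
sharpened = softmax(logits, temperature=0.5)  # a lower temperature sharpens it
token = sample_token(logits, rng=random.Random(0))
```

With a temperature below 1.0 the most likely token receives more probability mass than under direct sampling, which is why the temperature setting matters when comparing reported numbers.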

haohaodw commented 3 weeks ago

Thanks for your reply. When we run experiments/eval/object_hallucination_vqa_llava.py with its default parameter settings, does that mean VCD is performing direct sampling? Are these default settings consistent with those of the main experiments in the paper?

LengSicong commented 3 weeks ago

Yes, it is doing direct sampling.

haohaodw commented 3 weeks ago

Thanks for your reply. How can I use VCD with greedy search? What code do I need to modify?
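The thread does not say which file to change, so the following is only a conceptual sketch of one greedy VCD decoding step, not the repository's code. It assumes the formulation described in the VCD paper: contrast the next-token distribution from the original image against the one from a distorted image, keep only tokens that pass a plausibility cutoff, and then take the argmax instead of sampling. The function name, the parameter names `alpha` and `beta`, and their values are illustrative assumptions.

```python
import math

def log_softmax(logits):
    """Log-probabilities computed with the log-sum-exp trick."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def vcd_greedy_step(logits_original, logits_distorted, alpha=1.0, beta=0.1):
    """Pick the next token by argmax over contrastive scores (hypothetical helper).

    logits_original:  next-token logits conditioned on the original image
    logits_distorted: next-token logits conditioned on the distorted image
    alpha: contrast strength; beta: plausibility threshold (illustrative values)
    """
    logp_orig = log_softmax(logits_original)
    logp_dist = log_softmax(logits_distorted)
    # Plausibility constraint: keep tokens with p_orig >= beta * max(p_orig).
    cutoff = math.log(beta) + max(logp_orig)
    best_token, best_score = None, -math.inf
    for token, (lo, ld) in enumerate(zip(logp_orig, logp_dist)):
        if lo < cutoff:
            continue  # implausible under the original image, so masked out
        # Contrastive score. Using log-probs instead of raw logits only shifts
        # every score by the same constant, so the argmax is unchanged.
        score = (1 + alpha) * lo - alpha * ld
        if score > best_score:
            best_token, best_score = token, score
    return best_token

# Toy example: token 0 is likely with or without visual evidence, while
# token 1 is supported mainly by the original image.
logits_original = [3.0, 2.8, 0.0]
logits_distorted = [3.0, 1.0, 0.0]
plain_greedy = max(range(3), key=lambda i: logits_original[i])
vcd_greedy = vcd_greedy_step(logits_original, logits_distorted)
```

In this toy case plain greedy search on the original logits picks token 0, while the contrastive step picks token 1. In an actual generation loop, a step like this would stand in for the point where the next token is sampled from the contrasted distribution.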