Unable to reproduce the results of LLaVA

DAMO-NLP-SG / VCD

[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding

Apache License 2.0

Unable to reproduce the results of LLaVA #9

Closed frankRenlf closed 3 months ago

frankRenlf commented 4 months ago

I've followed the process, but my results are three points below those in the paper. What other details do I need to pay attention to in order to reproduce the paper's results?

LengSicong commented 3 months ago

Hi, may I know more details about your experiments so that we can help you better?

frankRenlf commented 3 months ago

The tables report a standard deviation, e.g. (±0.42). What causes the results to fluctuate? With a fixed seed, my results stay the same. I also found that the regular-decoding results produced by the code in your repo are higher than the results in your paper.

LengSicong commented 3 months ago

The fluctuation may come from the random sampling process, depending on the sampling strategy. Moreover, adding random Gaussian noise may also introduce randomness.

For regular decoding, are you using a newer version of LLaVA?
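To illustrate why the noise step is seed-dependent, here is a minimal sketch. It assumes a forward-diffusion-style mix of the image with standard Gaussian noise. The function name `add_gaussian_noise` and the `alpha_bar` parameter are hypothetical and are not taken from the repo, whose actual noising code may differ.

```python
import math
import random

def add_gaussian_noise(pixels, alpha_bar, seed):
    """Return sqrt(alpha_bar) * x + sqrt(1 - alpha_bar) * eps, with eps ~ N(0, 1).

    `pixels` is a flat list of floats (a toy stand-in for an image tensor).
    `alpha_bar` in [0, 1] is a hypothetical knob: 1.0 keeps the image intact,
    0.0 replaces it with pure noise.
    """
    rng = random.Random(seed)  # the seed fully determines the noise
    keep = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [keep * x + noise_scale * rng.gauss(0.0, 1.0) for x in pixels]

image = [0.1, 0.5, 0.9, 0.3]  # toy values, not real pixel data
same_a = add_gaussian_noise(image, alpha_bar=0.5, seed=0)
same_b = add_gaussian_noise(image, alpha_bar=0.5, seed=0)
other = add_gaussian_noise(image, alpha_bar=0.5, seed=1)

print(same_a == same_b)  # identical seed, identical noised image
print(same_a == other)   # different seed, different noised image
```

With a fixed seed the noised image is reproduced exactly, so any score computed from it is reproduced as well. Changing the seed changes the noise, which can shift the final metric.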

frankRenlf commented 3 months ago

Thanks for your reply. Yes, I just use LLaVA 7B.

I would also like to know where the std comes from.

LengSicong commented 3 months ago

Hi, the decoding strategy for the main table is always direct sampling without any constraints (such as top-p or temperature normalization). The std may come from the sampling process or from the randomness introduced when adding Gaussian noise to obtain a noised image.
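A minimal sketch of what direct sampling means here, under the assumption that each token is drawn from the full softmax distribution with no top-p filtering and no temperature scaling. The logits are toy values, and `sample_token` and `sample_sequence` are hypothetical helpers, not functions from the repo or from LLaVA.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, rng):
    """Draw one token index from the unmodified softmax distribution."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def sample_sequence(logits, seed, length=8):
    """Sample `length` tokens; the seed fully determines the draws."""
    rng = random.Random(seed)
    return [sample_token(logits, rng) for _ in range(length)]

logits = [2.0, 1.5, 0.3, -1.0]  # toy next-token logits
run_a = sample_sequence(logits, seed=0)
run_b = sample_sequence(logits, seed=0)
run_c = sample_sequence(logits, seed=1)

print(run_a == run_b)  # same seed reproduces the same tokens
print(run_a, run_c)    # a different seed may produce different tokens
```

Because every token is a random draw, two runs agree only when they share a seed. This matches the observation in this thread that results stay fixed once the seed is set.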

frankRenlf commented 3 months ago

Hi, but the regular decoding results also have a std. What causes this? When I set the seed, the results do not change. So does the std come from the sampling process, with the differences arising from different seeds?

LengSicong commented 3 months ago

Yep, correct. When the seed is different, both the sampling process during decoding and the process of adding Gaussian noise to the original image introduce randomness, which may cause the std.
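One way to obtain a "mean (±std)" figure is to repeat the evaluation with several seeds and aggregate the scores. The sketch below assumes this setup. `run_eval` is a hypothetical stand-in that simulates a stochastic evaluation rather than calling the real model, and the choice of sample standard deviation (`statistics.stdev`) is an assumption, since the thread does not say which estimator the paper uses.

```python
import random
import statistics

def run_eval(seed: int) -> float:
    """Hypothetical stand-in for one full evaluation run; returns a score in [0, 100]."""
    rng = random.Random(seed)
    # Simulate 1000 yes/no questions, each answered correctly with probability 0.8.
    correct = sum(rng.random() < 0.8 for _ in range(1000))
    return 100.0 * correct / 1000

def summarize(seeds):
    """Run the evaluation once per seed and return (mean, sample std)."""
    scores = [run_eval(s) for s in seeds]
    return statistics.mean(scores), statistics.stdev(scores)

mean, std = summarize([0, 1, 2, 3, 4])
print(f"{mean:.2f} (±{std:.2f})")
```

Re-running with the same list of seeds gives the same summary, while a single fixed seed gives one score and no spread at all.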

haohaodw commented 2 months ago

Hi, I still don't know which version of LLaVA to use to reproduce the results. Can you share the Hugging Face link?

LengSicong commented 2 months ago

https://huggingface.co/liuhaotian/llava-v1.5-7b

HaozheZhao commented 1 month ago

https://huggingface.co/liuhaotian/llava-v1.5-7b

Hi Sicong, why are the results reported in the paper much lower than the performance reported in the original LLaVA-1.5 paper?

For example, the original LLaVA-1.5 paper reports POPE F1 scores on three splits (Ran, Adv, and Pop): 87.3, 86.1, 84.2.

But in your paper, the results are 81.33, 77.57, and 80.06.