[Open] qppwdd0324 opened this issue 3 months ago
Thank you for your prompt reply. May I ask what these functions in halc.py represent? Their comments are identical: "The method uses a list of context windows rooted from the DINO detection one and applies the contrasting decoding method to each context window pair to get a list of contrasting logits. Then we use the..."
Sorry, the image seems not to have been uploaded successfully. These functions share the same comment because they are different contrasting methods used to contrast the various sampled FOV logits with each other. You can view this line to see how they are used. Eventually we used the `context_layer_double_multi_contrastive_decoding` function, as described in our paper.
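To make the idea concrete, here is a minimal sketch of the general contrastive-decoding recipe applied to a pair of FOV logit vectors. This is not HALC's actual implementation; the function names, the `alpha`/`beta` parameters, and the "expert"/"amateur" framing are illustrative assumptions following the standard contrastive-decoding formulation.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def contrastive_logits(expert_logits, amateur_logits, alpha=1.0, beta=0.1):
    """Hypothetical helper: contrast one FOV's logits against another's.

    Amplifies what the 'expert' view predicts relative to the 'amateur'
    view, and masks tokens the expert itself finds implausible
    (the adaptive plausibility constraint from contrastive decoding).
    """
    exp_lp = log_softmax(expert_logits)
    ama_lp = log_softmax(amateur_logits)
    cutoff = math.log(beta) + max(exp_lp)  # plausibility threshold
    return [
        (1 + alpha) * e - alpha * a if e >= cutoff else float("-inf")
        for e, a in zip(exp_lp, ama_lp)
    ]

# Token 0 is favored by the expert view but not the amateur one,
# so contrasting sharpens the preference for token 0.
scores = contrastive_logits([2.0, 1.0, 0.5], [0.5, 1.0, 0.5], alpha=0.5)
best = max(range(len(scores)), key=lambda i: scores[i])
```

In HALC the same contrast is applied to every pair in the list of context windows, yielding the list of contrasting logits that the comment refers to.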
Hi,
Thanks for your interest in this project!!
Sorry if the Readme is not written clearly enough; here is a snippet from the Readme that describes how to run caption generation for CHAIR and POPE.
:chair: Running CHAIR evaluation for LVLMs object hallucination
Following Evaluating Object Hallucination in Large Vision-Language Models, we used "Please describe this image in detail." as the prompt to query the LVLM for captions of 500 images randomly sampled from the COCO 2014 Val dataset. Under the root directory, run the caption generation script; passing `--debugging 1`
will print the intermediate hallucination correction process of HALC.

:man_in_tuxedo: Running POPE evaluation for LVLMs object hallucination
Since OPOPE evaluates directly on the caption generated for each image, it follows the caption generation procedure for CHAIR and differs only in the subsequent metric calculation. To collect samples for the conventional POPE evaluation, run the corresponding script under the root directory.
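For reference, the metric calculation that POPE-style evaluation performs on the collected yes/no answers is the standard binary-classification suite. A minimal sketch (function and variable names here are illustrative, not HALC's exact code):

```python
def pope_metrics(predictions, labels):
    """Compute accuracy, precision, recall, and F1 from yes/no answers
    to "Is there a <object> in the image?" probes.
    """
    pairs = list(zip(predictions, labels))
    tp = sum(p == "yes" and l == "yes" for p, l in pairs)
    fp = sum(p == "yes" and l == "no" for p, l in pairs)
    fn = sum(p == "no" and l == "yes" for p, l in pairs)
    tn = sum(p == "no" and l == "no" for p, l in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Example: 3 of 4 probes answered correctly (one false "yes").
m = pope_metrics(["yes", "no", "yes", "no"], ["yes", "no", "no", "no"])
```

Since LVLMs tend to answer "yes" to presence probes, precision is usually the most informative of the four numbers.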
You can also directly run the demo file here to test single-image captioning. To run this demo, put the path of the image you want to evaluate in this list, and then run the demo script.
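Once a caption is generated (via the demo or the batch script), the per-caption CHAIR metric checks the mentioned objects against the image's ground-truth COCO annotations. A minimal sketch, using naive word matching for illustration (the real CHAIR pipeline uses synonym lists to map caption words to COCO object categories):

```python
def chair_instance(caption, gt_objects, vocabulary):
    """Per-caption CHAIR_i: the fraction of mentioned COCO objects that
    are hallucinated, i.e. absent from the ground-truth annotations.
    Naive whole-word matching; illustrative only.
    """
    words = set(caption.lower().replace(".", "").replace(",", "").split())
    mentioned = [obj for obj in vocabulary if obj in words]
    hallucinated = [obj for obj in mentioned if obj not in gt_objects]
    return len(hallucinated) / len(mentioned) if mentioned else 0.0

# "car" is mentioned but not annotated, so 1 of 3 mentions is hallucinated.
vocab = {"dog", "cat", "frisbee", "car"}
score = chair_instance("A dog catches a frisbee near a car.",
                       {"dog", "frisbee"}, vocab)
```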
We hope this helps you run HALC; we will improve the Readme later to make it clearer. If there are further questions, please don't hesitate to ask :)