BillChan226 / HALC

[ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"
https://billchan226.github.io/HALC
MIT License

The Fig. 1 experiment details #4

Open Stevetich opened 1 month ago

Stevetich commented 1 month ago

I have read your paper; it is great work. However, I have a few questions about the details of the Fig. 1 experiment, which you conducted to verify the significance of the optimal visual context. What I don't understand is: after the brute-force search for the optimal visual context, how is that context used to generate the correct answer? Is it fed to the model as a standalone image, or is it used within your proposed framework? Thank you for your explanation!

BillChan226 commented 1 month ago

Hi, thanks for your interest in HALC! For the images found by the brute-force search, we simply input each one into the VLM as an individual image. You can also manually fix the retrieved optimal visual context in our framework to automate the inference process. Please let me know if there are any further questions, thanks :)
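
For concreteness, here is a minimal sketch of both options, assuming a generic `vlm.generate(image, prompt)` interface; the function names, box format, and match criterion are placeholders for illustration, not the actual HALC code:

```python
from PIL import Image

def answer_from_visual_context(vlm, image_path, box, prompt):
    """Option 1: feed the searched visual context to the VLM as a
    standalone image. `box` is an (x1, y1, x2, y2) crop region and
    `vlm.generate(image, prompt)` is a placeholder for the model's API."""
    full_image = Image.open(image_path).convert("RGB")
    visual_context = full_image.crop(box)  # the optimal visual context
    return vlm.generate(visual_context, prompt)

def brute_force_search(vlm, image_path, prompt, correct_token, candidate_boxes):
    """Brute-force search over candidate crops: return the first region
    whose answer contains the correct (non-hallucinated) token. This match
    criterion is an assumption made for illustration only."""
    full_image = Image.open(image_path).convert("RGB")
    for box in candidate_boxes:
        answer = vlm.generate(full_image.crop(box), prompt)
        if correct_token in answer:
            return box  # this crop serves as the optimal visual context
    return None
```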

Stevetich commented 1 month ago

Thanks for your reply! I have no more questions.