BradyFU / Woodpecker

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs.

indicator evaluation #7

Closed Rh-Dang closed 8 months ago

Rh-Dang commented 8 months ago

I am very interested in your work. Could you please publish the code related to your indicator evaluation? Thanks very much!

xjtupanda commented 8 months ago

Thanks for your attention! Could you please specify which part, since 'indicator' evaluation is vague?

Rh-Dang commented 8 months ago

Thank you very much for your reply! By "indicators" I mean the evaluation metrics reported quantitatively in your paper, such as those on the POPE and MME benchmarks and in the GPT-4V-aided evaluation. We would like to replicate your results on these benchmarks.

xjtupanda commented 8 months ago

You may follow the official evaluation protocols, as we did. The benchmark repos provide evaluation tools that calculate the metrics automatically.
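
For orientation only: POPE-style benchmarks pose yes/no questions and report accuracy, precision, recall, F1, and the ratio of "yes" answers. The sketch below shows roughly how such metrics can be computed from a list of answers. It is not the official evaluation tool and is not taken from this repo; the function name `binary_metrics` and the plain list-of-strings input format are assumptions made for illustration, so use the benchmark repos' own scripts when replicating the reported numbers.

```python
from typing import Dict, List


def binary_metrics(predictions: List[str], labels: List[str]) -> Dict[str, float]:
    """Hypothetical helper: metrics for yes/no answers ("yes" is the positive class)."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")

    # Normalize answers to booleans: True means "yes".
    pred = [p.strip().lower() == "yes" for p in predictions]
    gold = [g.strip().lower() == "yes" for g in labels]

    # Confusion-matrix counts.
    tp = sum(p and g for p, g in zip(pred, gold))
    fp = sum(p and not g for p, g in zip(pred, gold))
    fn = sum(not p and g for p, g in zip(pred, gold))
    tn = sum(not p and not g for p, g in zip(pred, gold))

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    return {
        "accuracy": (tp + tn) / len(gold),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "yes_ratio": sum(pred) / len(pred),
    }


if __name__ == "__main__":
    # Toy data, not real benchmark output.
    print(binary_metrics(["yes", "no", "yes", "no"], ["yes", "yes", "no", "no"]))
```

Free-form model outputs usually need to be mapped to "yes" or "no" before a step like this; how that mapping is done is benchmark-specific and is not covered here.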