BradyFU / Woodpecker

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs.

indicator evaluation #7

Closed Rh-Dang closed 1 year ago

Rh-Dang commented 1 year ago

I am very interested in your work, could you please publish the code related to your indicator evaluation? Thanks very much!!!

xjtupanda commented 1 year ago

Thanks for your attention! Could you please specify which part, since 'indicator' evaluation is vague?

Rh-Dang commented 1 year ago

Thank you very much for your reply! By "indicators" I mean the evaluation metrics reported quantitatively in your paper, such as those on the POPE and MME benchmarks and in the GPT-4V-aided evaluation. We would like to replicate your results on these benchmarks.

xjtupanda commented 1 year ago

You can follow the official evaluation protocols, as we did. The benchmark repos provide evaluation tools that calculate the metrics automatically.

POPE
MME

The prompt for GPT-4V-Aided Evaluation is available at Link
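
For readers who want a feel for what such an evaluation tool computes, here is a minimal sketch of the metrics usually reported for a yes/no benchmark like POPE (accuracy, precision, recall, F1, and the share of "yes" answers). It is not the official POPE or MME script. The function name `binary_metrics` and the input format (two parallel lists of "yes"/"no" strings) are assumptions made for illustration; use the benchmark repos' own tools when replicating the paper's numbers.

```python
from typing import Dict, List


def binary_metrics(predictions: List[str], labels: List[str]) -> Dict[str, float]:
    """Compute yes/no classification metrics, treating "yes" as the positive class.

    Hypothetical helper for illustration only; the input format is an assumption,
    not the format used by the official benchmark tools.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")

    # Normalize answers: anything other than "yes" counts as "no".
    preds = [p.strip().lower() == "yes" for p in predictions]
    golds = [g.strip().lower() == "yes" for g in labels]

    # Confusion-matrix counts.
    tp = sum(p and g for p, g in zip(preds, golds))
    fp = sum(p and not g for p, g in zip(preds, golds))
    fn = sum(not p and g for p, g in zip(preds, golds))
    tn = sum(not p and not g for p, g in zip(preds, golds))

    total = len(preds)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    yes_ratio = sum(preds) / total if total else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "yes_ratio": yes_ratio,
    }


if __name__ == "__main__":
    # Toy data, not real benchmark output.
    print(binary_metrics(["yes", "no", "yes", "yes"], ["yes", "no", "no", "yes"]))
```

A high "yes" ratio alongside low precision suggests the model tends to affirm objects that are not present, which is the kind of hallucination these benchmarks are meant to expose.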