Lackel / AGLA

Code for paper "AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention"

About avg.length #4

Open yjch00 opened 2 months ago

yjch00 commented 2 months ago

Great work! I read through the paper and have a question: does 'avg.len' in the CHAIR experiment refer to the average length in tokens?

yjch00 commented 2 months ago

Also, could you share the POPE performance (COCO, A-OKVQA, GQA) not only for your model but also for the other compared models?

Lackel commented 2 months ago

Hi, thanks for your interest!

'avg.len' in the CHAIR experiment is the average number of tokens per caption after nltk.word_tokenize. Please refer to the evaluation file for more details: https://github.com/yuezih/less-is-more/blob/main/CHAIR-eval/chair.py.
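For reference, here is a minimal sketch of that computation (assuming the generated captions are plain strings; the exact bookkeeping in chair.py may differ):

```python
# Hypothetical sketch of how 'avg.len' is computed: the average
# number of tokens per generated caption after nltk.word_tokenize.
import nltk

nltk.download("punkt", quiet=True)  # tokenizer models, needed once

captions = [
    "A man riding a horse on the beach.",
    "Two dogs playing with a frisbee in the park.",
]

token_counts = [len(nltk.word_tokenize(c)) for c in captions]
avg_len = sum(token_counts) / len(token_counts)
print(f"avg.len = {avg_len:.2f}")
```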

As for the compared models, the results of Regular and VCD are cited from the VCD paper. For DoLa and OPERA, I cannot access the original results right now since I am away from school; you can refer to the implementation at https://github.com/BillChan226/HALC.

yjch00 commented 2 months ago

Thank you very much for your kind response.