yjch00 opened 2 months ago
Also, could you share the POPE results (on COCO, AOKVQA, and GQA) not only for your model but also for the other compared models?
Hi, thanks for your interest!
'avg.len' in the CHAIR evaluation is the average number of tokens per caption after applying nltk.word_tokenize; please refer to the evaluation file for more details (https://github.com/yuezih/less-is-more/blob/main/CHAIR-eval/chair.py).
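For concreteness, here is a minimal sketch of how such an average token length can be computed with nltk.word_tokenize. The helper name and sample captions are illustrative, not taken from the repository; the authoritative logic lives in the linked chair.py.

```python
def avg_token_length(captions, tokenize=None):
    """Return the average number of word tokens per caption.

    By default this uses nltk.word_tokenize, matching the 'avg.len'
    description above; any callable tokenizer (e.g. str.split) can be
    passed instead. This is an illustrative sketch, not the repo's code.
    """
    if tokenize is None:
        import nltk  # requires NLTK's tokenizer data (e.g. 'punkt') to be installed
        tokenize = nltk.word_tokenize
    lengths = [len(tokenize(c)) for c in captions]
    return sum(lengths) / len(lengths)


# Illustrative usage with a plain whitespace tokenizer:
captions = ["a dog on the grass", "two people riding bicycles"]
print(avg_token_length(captions, tokenize=str.split))  # → 4.5
```

With nltk.word_tokenize, punctuation marks are counted as separate tokens, so the reported average length is typically slightly higher than a whitespace-based count.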
As for the compared models, the results for Regular and VCD are cited from the VCD paper. For DoLa and OPERA, I cannot access the original results right now since I am away from school; you can refer to the implementation at https://github.com/BillChan226/HALC.
Thank you very much for your kind response.
Great work! I have read the paper and have a question: does 'avg.len' in the paper's CHAIR experiment refer to the average number of tokens?