Closed: pbevan1 closed this issue 7 months ago
Hello! Please refer to #9 for the explanation of the number of noun phrases. If the question is an empty string, the BLIP score is set to 1 (L#99-100 in BLIPvqa_eval/BLIP_vqa.py). The final score is the product of the BLIP scores for the individual noun phrases, so empty strings do not affect it: each one contributes a factor of 1.
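To illustrate why the empty questions are harmless, here is a minimal sketch of the idea. It is not the code from BLIP_vqa.py: the function name `combined_blip_score` and the example questions and scores are made up for illustration, and it only assumes what is described above (empty question gives a score of 1, and the final score is a product).

```python
from math import prod

def combined_blip_score(questions, scores):
    """Multiply per-noun-phrase scores (hypothetical helper, not the repo's API).

    An empty question is replaced by the neutral factor 1.0, so it cannot
    change the product.
    """
    factors = [1.0 if q == "" else s for q, s in zip(questions, scores)]
    return prod(factors)

# Illustrative data: 8 noun-phrase slots, only the first 2 hold real questions.
questions = ["a red apple?", "a blue bowl?", "", "", "", "", "", ""]
scores = [0.9, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

with_padding = combined_blip_score(questions, scores)
without_padding = combined_blip_score(questions[:2], scores[:2])
print(with_padding, without_padding)
```

Under these assumptions, evaluating with 8 slots gives the same result as evaluating with only the 2 non-empty ones, so the default does not dilute the score.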
Great, that explains it, thanks. Sorry I forgot to check the closed issues!
I noticed the default number of noun phrases is 8 for the BLIP eval, but from the 2nd onwards all or most of the questions are empty strings. So is the final calculation averaging across a lot of invalid responses? Am I correct here or am I missing something? Shouldn't this default to 2?