Open haohaodw opened 3 months ago
Thanks for your interest! In our experiments, we observed that the four LVLMs answer POPE questions in the format "Yes/No, there is/isn't {object} ...". Because the response names the object explicitly, LURE can mask it. For instance, the responses of mPLUG-Owl to some POPE questions are listed below:
The responses of LLaVA-1.5 to some POPE questions are listed below:
However, POPE accuracy is computed from the yes/no answer alone. So how do you judge whether the response revised by LURE is correct?
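One possible way to score such responses, sketched here as an assumption and not as the authors' confirmed evaluation procedure: since the responses follow the "Yes/No, there is/isn't {object} ..." format, the leading Yes/No word of the revised response could be taken as the prediction and compared with the POPE label. The function names `extract_yes_no` and `pope_accuracy` and the sample responses are hypothetical, made up for illustration.

```python
import re


def extract_yes_no(response: str):
    """Return 'yes' or 'no' if the response starts with that word, else None."""
    # Assumption: the answer polarity is carried by the first word only.
    match = re.match(r"\s*(yes|no)\b", response, flags=re.IGNORECASE)
    return match.group(1).lower() if match else None


def pope_accuracy(responses, labels):
    """Fraction of responses whose leading yes/no matches the ground-truth label."""
    if not labels:
        return 0.0
    correct = sum(
        extract_yes_no(response) == label.lower()
        for response, label in zip(responses, labels)
    )
    return correct / len(labels)


# Hypothetical revised responses and labels, for illustration only.
revised = [
    "Yes, there is a dog in the image.",
    "No, there isn't a car in the image.",
    "Yes, there is a bench in the image.",
]
ground_truth = ["yes", "no", "no"]
accuracy = pope_accuracy(revised, ground_truth)
```

A response that does not begin with Yes or No yields `None` here and is counted as incorrect; whether that matches the scoring actually used for LURE is not stated in this thread.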
Nice work. I would like to ask a question about LURE. LURE needs to mask the object during inference and then correct it. However, POPE and MME are discriminative tasks that are answered with Yes/No. How do you test the performance of LURE on these two datasets?