Open HaozheZhao opened 12 months ago
Sure. We follow Winoground's evaluation exactly; you can see very similar code here: https://github.com/yonatanbitton/CLIPEvaluation/blob/main/src/eval_winoground.py The only difference is that we use the BLIP2 image-text score instead of the CLIP score. Please let me know if this is clearer now or if you have additional questions.
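For readers who want the metric spelled out, here is a minimal sketch of the standard Winoground text, image, and group scores. It is not taken from the linked script. It assumes the four image-text scores for each example (two captions crossed with two images) have already been computed by some model, for instance a BLIP2 image-text score, and the dictionary keys `c0_i0`, `c0_i1`, `c1_i0`, `c1_i1` are hypothetical names chosen for this illustration.

```python
from typing import Dict, List

# One example = four similarity scores, keyed by caption/image index.
# Key names are illustrative assumptions, e.g. "c0_i1" = caption 0 with image 1.
Scores = Dict[str, float]


def text_correct(s: Scores) -> bool:
    # For each image, the matching caption must score higher than the other caption.
    return s["c0_i0"] > s["c1_i0"] and s["c1_i1"] > s["c0_i1"]


def image_correct(s: Scores) -> bool:
    # For each caption, the matching image must score higher than the other image.
    return s["c0_i0"] > s["c0_i1"] and s["c1_i1"] > s["c1_i0"]


def group_correct(s: Scores) -> bool:
    # An example counts for the group score only if both checks pass.
    return text_correct(s) and image_correct(s)


def winoground_metrics(examples: List[Scores]) -> Dict[str, float]:
    # Each metric is the fraction of examples that pass the corresponding check.
    n = len(examples)
    return {
        "text": sum(text_correct(s) for s in examples) / n,
        "image": sum(image_correct(s) for s in examples) / n,
        "group": sum(group_correct(s) for s in examples) / n,
    }


# Made-up scores for two examples, only to show the call.
examples = [
    {"c0_i0": 0.9, "c1_i0": 0.1, "c0_i1": 0.2, "c1_i1": 0.8},
    {"c0_i0": 0.6, "c1_i0": 0.5, "c0_i1": 0.7, "c1_i1": 0.8},
]
metrics = winoground_metrics(examples)
```

Swapping the scoring model (CLIP versus BLIP2) only changes how the four numbers per example are produced; the comparisons stay the same.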
Could you please explain precisely how you evaluate your methods and BLIP2 using the Winoground metric?