Open MajorDavidZhang opened 10 months ago
I could not get the same results either, and I found that repeating the same command multiple times yielded different results in the 'Positional and Relational Context', 'Structural Characteristics', and 'Orientation and Direction' categories. The other categories gave consistent results across runs.
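One common source of run-to-run variance like this is unseeded randomness (e.g. data shuffling or sampling order). A minimal sketch of pinning the RNG state before evaluation — the helper name is mine, not from the repo, and the commented lines assume a PyTorch-based script like evaluate_vlm.py:

```python
import random

def set_seed(seed: int = 0) -> None:
    """Pin RNG state so repeated runs draw identical random values.

    In the actual evaluation script one would also seed the other
    libraries in use (hypothetical; adapt to what the repo imports):
        numpy.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    """
    random.seed(seed)

# Two seeded runs produce identical samples; unseeded runs generally do not.
set_seed(0)
first = [random.random() for _ in range(3)]
set_seed(0)
second = [random.random() for _ in range(3)]
```

If the category scores still vary with all seeds fixed, the remaining nondeterminism likely comes from GPU kernels or library versions, which would be worth reporting alongside the numbers when comparing against Table 1.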
Similar issue, unable to reproduce the results in Table 1 even when running the exact code in the repo. @tsb0601 any advice? Unlike @lst627, I get consistent results across multiple runs, but they are consistently worse than the reported numbers.
Still no update on this
+1
Hi, thanks for your insightful work! I am using your MMVP benchmark to test the performance of different CLIP models. However, when I run the exact code from evaluate_vlm.py, I cannot reproduce the results in Table 1 of the paper. My results differ from the first row of Table 1, and in fact from every row of Table 1. Could you confirm the reported numbers? Thanks very much!