Closed etr2460 closed 6 months ago
With this improvement we now have a 0-shot performance of 59.6% (averaged over 3 eval runs) on the MMMU validation set, which beats the 56.8% reported in the MMMU paper
With this improvement we now have a 0-shot performance of 59.6% (averaged over 3 eval runs) on the MMMU validation set, which beats the 56.8% reported in the MMMU paper