Closed vishaal27 closed 3 days ago
Hello Vishaal, Thank you for your interest in our work! You can find the detailed evaluation results for all tasks in [individual_results.csv], including 21 classification tasks.
In each benchmark column, `::` serves as a delimiter for scores within individual benchmarks.
For Winoground-style datasets that include sub-tasks (e.g., eqben and mmvp_vlm), an additional `::` separates the 'text', 'image', and 'group' scores.
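For anyone reading the file programmatically, here is a minimal sketch of splitting one `::`-delimited cell into numeric scores. The helper name `split_scores` and the example value are made up for illustration and are not taken from the released CSV. The sketch assumes each part of a cell is a plain number, and it does not try to guess which position corresponds to which sub-score.

```python
from typing import List

DELIMITER = "::"  # delimiter described in the reply above


def split_scores(cell: str) -> List[float]:
    """Split one '::'-delimited cell into a flat list of float scores.

    Hypothetical helper: assumes every non-empty part parses as a number.
    """
    return [float(part) for part in cell.split(DELIMITER) if part.strip()]


# Made-up cell value for illustration only; not a real result.
example_cell = "0.5::0.25::0.125"
print(split_scores(example_cell))
```

With pandas, the same helper could be applied per column after loading the file as strings, for example with `df[col].map(split_scores)`. The actual column names would need to be checked against the CSV.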
The code will be uploaded soon, once its release has been approved.
Thanks,
thanks so much, this is great!
Hey, thanks for your great work and for publicly releasing your results and code. I am very interested in obtaining the individual per-dataset (non-compositionality) results for all the models in your results.csv file. Would you be able to release the individual model scores on each of the 21 datasets independently? That would be awesome, thanks in advance!