Hello everyone,
Thank you for open-sourcing such a great evaluation repo!
I’m Shan Chen, primarily working in the field of health AI at Harvard.
We’ve recently developed a multimodal evaluation dataset, built by cleaning up several years of medical exam data from four countries (Japan, Israel, Spain, and Brazil).
This work is being submitted to NAACL 2025, and our data is in the VLMEvalKit MCQ format (identical to MMBench, etc.). We hope it can be integrated into the official repo. We can provide the Hugging Face dataset link to make the integration smooth.
Thanks a lot! Shan Chen