Why is every answer in Structural Engineering just "?"

MMMU-Benchmark / MMMU

This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"

https://mmmu-benchmark.github.io/

Apache License 2.0


Why is every answer in Structural Engineering just "?" #7

Closed. mckinziebrandon closed this issue 9 months ago.

mckinziebrandon commented 9 months ago

I was browsing the test set with the Dataset Viewer on HuggingFace and noticed that, for the Structural Engineering subset of Architecture_and_Engineering, literally every single answer and explanation is "?". Surely this is a bug?

mckinziebrandon commented 9 months ago

Oh, never mind. I guess labels aren't provided for the test split, so the "?" values are placeholders and users can't score their own models on test locally.
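For anyone who runs into the same thing, here is a minimal sketch for checking whether a split's labels are just placeholders. It assumes the rows are available as plain dictionaries with `answer` and `explanation` fields (the two fields mentioned above) and that the placeholder is the literal string "?". The helper name `placeholder_fraction` and the sample rows are hypothetical and made up for illustration; they are not taken from the dataset or from this repo's evaluation code.

```python
from typing import Dict, Iterable, Mapping, Sequence


def placeholder_fraction(
    rows: Iterable[Mapping[str, object]],
    fields: Sequence[str] = ("answer", "explanation"),
    placeholder: str = "?",
) -> Dict[str, float]:
    """Return, per field, the fraction of rows whose value equals the placeholder."""
    rows = list(rows)
    if not rows:
        # No rows: report 0.0 rather than dividing by zero.
        return {field: 0.0 for field in fields}
    return {
        field: sum(
            1 for row in rows if str(row.get(field, "")).strip() == placeholder
        ) / len(rows)
        for field in fields
    }


# Hypothetical rows, invented for illustration only (not real dataset content).
test_rows = [
    {"id": "test_example_1", "answer": "?", "explanation": "?"},
    {"id": "test_example_2", "answer": "?", "explanation": "?"},
]
labeled_rows = [
    {"id": "labeled_example_1", "answer": "B", "explanation": "Worked solution."},
    {"id": "labeled_example_2", "answer": "D", "explanation": ""},
]

print(placeholder_fraction(test_rows))
print(placeholder_fraction(labeled_rows))
```

A fraction of 1.0 for `answer` across an entire split suggests the labels are withheld on purpose rather than corrupted for one subject.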