openai / mle-bench

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Why are there 82 datasets in the `./mlebench/compositions` folder? #5

Closed: JK-SHIN-PG closed this issue 1 week ago

JK-SHIN-PG commented 1 week ago

Hello,

Thanks for the great work!

I noticed that ./mlebench/compositions lists seven datasets that are not mentioned in the paper. Could you please confirm whether this means the benchmark has been extended?

james-aung commented 1 week ago

Thanks for the question. There are 7 competitions in the dev split in addition to the 75 main competitions, which gives the 82 you see in the folder (75 + 7 = 82).

https://github.com/openai/mle-bench/blob/main/experiments/splits/dev.txt
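
For anyone who wants to verify the count locally, here is a minimal sketch. It assumes each split file is plain text with one competition ID per line, which is a guess about the format and not something confirmed in this thread. The helper names `read_split` and `summarize` are hypothetical and not part of mle-bench.

```python
from pathlib import Path


def read_split(path):
    """Return the set of non-empty, stripped lines in a split file.

    Assumption: one competition ID per line; blank lines are ignored.
    """
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}


def summarize(main_split, dev_split):
    """Count the IDs in two split files, their overlap, and their union."""
    main_ids = read_split(main_split)
    dev_ids = read_split(dev_split)
    return {
        "main": len(main_ids),
        "dev": len(dev_ids),
        "overlap": len(main_ids & dev_ids),  # IDs listed in both files
        "total": len(main_ids | dev_ids),    # distinct IDs across both files
    }
```

Calling `summarize(<path to the main split file>, "experiments/splits/dev.txt")` from a checkout of the repository should report a total of 82 if the two splits are disjoint and the counts above hold. The name of the main split file is not given in this thread, so substitute the actual path.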