openai / mle-bench

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering
https://openai.com/index/mle-bench/
Other
529 stars 59 forks source link