sarahmart / HARDMath

A new dataset of difficult graduate-level applied mathematics problems; evaluations demonstrate that leading LLMs currently exhibit low accuracy in solving these problems.
MIT License
3 stars 0 forks