sarahmart/HARDMath
A new dataset of difficult graduate-level applied mathematics problems; evaluations demonstrate that leading LLMs currently exhibit low accuracy in solving these problems.
MIT License
3 stars
0 forks
issues
missing requirements.yml file
#3 opened 10 hours ago by angie-chen55
1
Question about few-shot evaluation
#2 closed 3 weeks ago by beichenzbc
1
Evaluation test data
#1 closed 2 months ago by wedu-nvidia
3