dvlab-research/MR-GSM8K
Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs
MIT License
40 stars
0 forks
issues
How to calculate the scores?
#4
zhangxjohn
closed
6 months ago
2
BUG: error_reason_correctness
#3
zhangxjohn
opened
7 months ago
1
the new version
#2
Shijie-Xia
opened
9 months ago
9
Question about the inconsistency between "model_output_solution_correctness" and "model_output_solution_first_error_step"
#1
Shijie-Xia
closed
9 months ago
3