dvlab-research MR-GSM8K issues - Githubissues

dvlab-research / MR-GSM8K

Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs

MIT License

40 stars, 0 forks

issues

How to calculate the scores?

#4 zhangxjohn closed 6 months ago
2 comments
BUG: error_reason_correctness

#3 zhangxjohn opened 7 months ago
1 comment
the new version

#2 Shijie-Xia opened 9 months ago
9 comments
question about the inconsistency between "model_output_solution_correctness" and "model_output_solution_first_error_step"

#1 Shijie-Xia closed 9 months ago
3 comments