Open chang-github-00 opened 3 months ago
Solved: it was caused by a missing package that `parse_latex` depends on. BTW, it seems that the LaTeX parser and the expression parser can yield different evaluation results (https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/Evaluation/PAL-Math/utils/grader.py#L102). It would be more accurate to grade a solution as correct as long as it passes either one of the two tests.
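
A minimal sketch of the "correct if either parser agrees" idea, written against the standard library only so it runs without SymPy. This is not the code in `grader.py`: `equal_under_any_parser`, `parse_fraction`, and `parse_decimal` are hypothetical names, and the two toy parsers merely stand in for the LaTeX parser and the expression parser.

```python
from fractions import Fraction


def parse_fraction(text):
    """Toy parser: accepts '1/2', '0.5', '3' (stand-in for one real parser)."""
    return Fraction(text.strip())


def parse_decimal(text):
    """Toy parser: accepts '0.5', '3', but rejects '1/2' (stand-in for the other)."""
    return float(text.strip())


def equal_under_any_parser(prediction, reference, parsers, tol=1e-9):
    """Return True if at least one parser reads both strings and the values match."""
    for parse in parsers:
        try:
            pred_val = parse(prediction)
            ref_val = parse(reference)
        except Exception:
            # This parser cannot handle one of the inputs; try the next one
            # instead of counting the answer as wrong.
            continue
        if abs(float(pred_val) - float(ref_val)) <= tol:
            return True
    return False


parsers = [parse_decimal, parse_fraction]
print(equal_under_any_parser("1/2", "0.5", parsers))   # parse_decimal fails, parse_fraction agrees
print(equal_under_any_parser("1/2", "0.25", parsers))  # no parser agrees
```

The point of the design is that a parser failing, or disagreeing, does not by itself mark the answer wrong; only the absence of any agreeing parser does.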
The grader used in math evaluation (PAL-MATH) cannot pass its own `_test_math_equal()` check. All three samples returned False.
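
Since the failure turned out to come from a missing package, a quick import check before running the grader can make this kind of problem visible. The sketch below is an assumption-laden illustration: the thread does not name the package, and `missing_modules` is a hypothetical helper. For SymPy's `parse_latex`, the ANTLR-based backend is commonly provided by `antlr4-python3-runtime`, which imports as `antlr4`, so that name is used here as a guess.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed dependency list; adjust to whatever the grader actually imports.
required = ["sympy", "antlr4"]
absent = missing_modules(required)
if absent:
    print("Missing modules:", ", ".join(absent))
else:
    print("All required modules are importable.")
```

If a parser dependency is absent and the resulting error is caught somewhere along the way, every comparison could come back False, which would match the symptom described above.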