Problem in Math Evaluation

deepseek-ai / DeepSeek-Coder

Open chang-github-00 opened 3 months ago

chang-github-00 commented 3 months ago

The grader used in the math evaluation (PAL-Math) fails its own _test_math_equal() check: all three samples return False.

chang-github-00 commented 3 months ago

Solved: the failure was caused by a missing package that parse_latex depends on. By the way, the LaTeX parser and the expression parser can yield different evaluation results (https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/Evaluation/PAL-Math/utils/grader.py#L102). It would be more accurate to grade a solution as correct as long as it passes at least one of the two tests.
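A minimal sketch of the "pass at least one test" idea, not the code in grader.py. The function name equal_under_any_parser and the two toy parsers are hypothetical stand-ins for the LaTeX parser and the expression parser; the point is only that a parser which raises (for example because a dependency is missing) is skipped instead of failing the whole comparison.

```python
from fractions import Fraction
from typing import Callable, Iterable

Parser = Callable[[str], object]


def equal_under_any_parser(prediction: str, reference: str,
                           parsers: Iterable[Parser]) -> bool:
    """Return True if at least one parser reads both strings as the same value."""
    for parse in parsers:
        try:
            if parse(prediction) == parse(reference):
                return True
        except Exception:
            # This parser cannot read one of the inputs (or is unavailable);
            # move on to the next parser instead of returning False here.
            continue
    return False


# Hypothetical stand-ins for the two real parsers (assumption, for illustration).
def parse_decimal(text: str) -> float:
    return float(text.strip())


def parse_fraction(text: str) -> Fraction:
    return Fraction(text.strip())


PARSERS = [parse_decimal, parse_fraction]

if __name__ == "__main__":
    # parse_decimal rejects "1/2", but parse_fraction accepts both inputs.
    print(equal_under_any_parser("1/2", "0.5", PARSERS))
    print(equal_under_any_parser("1/3", "0.5", PARSERS))
```

One trade-off of this design: accepting any single passing parser reduces false negatives, but a parser that is too permissive could also let a wrong answer through, so each parser should still be strict on its own.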