huggingface / lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
MIT License

Math normalization: do not crash on invalid format #331

Closed guipenedo closed 2 months ago

guipenedo commented 2 months ago

Wrap the asserts in a try/except so that malformed generations do not crash lighteval. See the original implementation: https://github.com/hendrycks/math/blob/main/modeling/eval_math_gpt.py
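
For illustration, here is a minimal sketch of the pattern described above. It is not the actual lighteval diff: the function name `remove_boxed_safe` and the constant `BOXED_PREFIX` are hypothetical, and the choice to return `None` on a format mismatch is an assumption about how callers would want to treat an unparseable generation.

```python
from typing import Optional

# Hypothetical constant: the LaTeX wrapper expected around a final answer.
BOXED_PREFIX = "\\boxed{"


def remove_boxed_safe(text: str) -> Optional[str]:
    """Strip a surrounding \\boxed{...} wrapper, or return None if the format is invalid.

    The format checks stay as asserts, but a failed assert no longer
    propagates and aborts the whole evaluation run.
    """
    try:
        assert text.startswith(BOXED_PREFIX)
        assert text.endswith("}")
        return text[len(BOXED_PREFIX):-1]
    except AssertionError:
        # Malformed generation: signal "no answer" instead of crashing.
        return None


print(remove_boxed_safe("\\boxed{42}"))
print(remove_boxed_safe("the answer is 42"))
```

With this shape, a malformed generation yields `None`, which the caller can score as incorrect rather than letting one bad sample stop the run.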