microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

[MWPBench] AGIEval-Math is actually a part of MATH/Test. Why include both of them? #1533

Open tongyx361 opened 2 months ago

tongyx361 commented 2 months ago

According to http://arxiv.org/abs/2403.02884, AGIEval-Math is actually a part of MATH/Test.

XingxingZhang commented 2 months ago

By reviewing the dataset, we have verified that the MATH test indeed includes AGIEval-Math. Thanks for pointing this out, @tongyx361!

We will update both the test set and the corresponding results accordingly.
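
For readers who want to run a similar check themselves, here is a minimal sketch of an overlap test between two problem sets. It is not the maintainers' actual procedure: the function names `normalize` and `split_by_overlap`, the whitespace and case normalization, and the toy problems are all assumptions made for illustration. Real data would need to be loaded from the MATH test set and AGIEval-Math files.

```python
import re


def normalize(problem: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not hide a match."""
    return re.sub(r"\s+", " ", problem).strip().lower()


def split_by_overlap(candidate, reference):
    """Split `candidate` into problems that also occur in `reference` and those that do not."""
    reference_keys = {normalize(p) for p in reference}
    overlapping = [p for p in candidate if normalize(p) in reference_keys]
    unique = [p for p in candidate if normalize(p) not in reference_keys]
    return overlapping, unique


# Hypothetical toy data standing in for the two test sets.
math_test = ["What is 1 + 1?", "Solve x^2 = 4 for x > 0.", "Compute 3 * 7."]
agieval_math = ["what is  1 + 1?", "Compute 3 * 7."]

overlapping, unique = split_by_overlap(agieval_math, math_test)
print(f"{len(overlapping)} of {len(agieval_math)} problems also appear in the reference set")
```

If `unique` comes back empty, the candidate set is fully contained in the reference set under this normalization. Exact matching after light normalization can miss near-duplicates (for example, differing LaTeX spacing), so a stricter or fuzzier key may be needed in practice.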