arkilpatel / SVAMP

NAACL 2021: Are NLP Models really able to Solve Simple Math Word Problems?
MIT License

Data size mismatch in SVAMP #2

Closed (jhshen95 closed this issue 3 years ago)

jhshen95 commented 3 years ago

Hi, according to the paper, the train and dev set under SVAMP/data/mawps-asdiv-a_svamp should be of size 3591 and 1000, respectively.

However, there are 3138 examples in train.csv and 1790 in dev.csv. Why don't the numbers match?

arkilpatel commented 3 years ago

Hi, thanks for pointing this out.

I had mistakenly uploaded dev set files from another augmentation-related experiment. I have now pushed the correct files, which contain the 1000 SVAMP examples.

Note that while the official size of MAWPS is 2373 problems, we work with the 1921 problems that are within the scope of our task. If you take a look at the problems in MAWPS here, you will notice many troublesome problems, such as those that require "rounding to the nearest tenth". These problems have been eliminated. Even the Graph2Tree implementation works with only these 1921 problems. Similarly, we eliminated one problem from ASDiv-A for being out of scope, which leaves 1217 problems. Hence the train set size is 1921 + 1217 = 3138.
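For anyone who wants to re-check these counts locally, here is a minimal Python sketch of the arithmetic above plus a row counter. The helper name `count_examples` and the assumption that each CSV starts with a single header row are illustrative and not taken from the repository. The remark about 3591 is only an observation from the numbers in this thread, not something confirmed here.

```python
import csv
import io

# Figures quoted in this thread.
MAWPS_OFFICIAL = 2373      # official MAWPS size
MAWPS_IN_SCOPE = 1921      # MAWPS problems kept after filtering
ASDIV_A_IN_SCOPE = 1217    # ASDiv-A problems kept after removing one
SVAMP_SIZE = 1000          # expected dev set size

expected_train = MAWPS_IN_SCOPE + ASDIV_A_IN_SCOPE
mawps_removed = MAWPS_OFFICIAL - MAWPS_IN_SCOPE

# Observation (an assumption, not confirmed in the thread): the unfiltered
# totals, 2373 + (1217 + 1), add up to the 3591 quoted in the question.
unfiltered_total = MAWPS_OFFICIAL + (ASDIV_A_IN_SCOPE + 1)


def count_examples(csv_file):
    """Count non-empty data rows in an open CSV file, skipping one header row.

    Hypothetical helper: assumes the first row is a header. csv.reader is used
    so that quoted fields containing newlines count as a single example.
    """
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the assumed header row
    return sum(1 for row in reader if row)


# Tiny in-memory demo; the column names here are made up for illustration.
sample = io.StringIO('Question,Answer\n"Line one,\nline two",3\nAnother question,5\n')
sample_count = count_examples(sample)

print(expected_train)    # 3138
print(mawps_removed)     # 452
print(unfiltered_total)  # 3591
print(sample_count)      # 2
```

To check the real files, open them with `open(path, newline="", encoding="utf-8")` and pass the file object to `count_examples`, then compare the results against `expected_train` and `SVAMP_SIZE`.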