Open allanj opened 3 years ago
Yes, MAWPS-s uses the 5-fold setting.
Thanks. Am I right that, for SVAMP, you are just directly doing train and test following the SVAMP paper?
SVAMP is only a test set. According to the SVAMP paper, the training set consists of MAWPS and ASDiv-A, and the setting is a train-test split. Running it with k-fold cross-validation may not be a good idea.
Got it. Maybe this should be specified in the table/paper?
From the table, it seems only those marked with "*" are train-test split.
In the SVAMP paper, Appendix A shows that the Transformer with a RoBERTa encoder obtains 38.9 accuracy.
But it seems RobertaGen only gets 30.3 here. I'm curious about the difference.
Is the experiment for MAWPS-s using 5-fold as well? It seems so to me, from what the paper reports. I got around 85.4 accuracy on MAWPS using a train/dev/test split. Wondering if I'm correct here.
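For anyone comparing numbers across the two settings discussed in this thread (5-fold cross-validation for MAWPS-s, a fixed train-test split for SVAMP), here is a minimal sketch of how the two differ. The function names and the `evaluate(train, test)` callback are hypothetical placeholders, not this repository's API; `evaluate` is assumed to train a model on `train` and return its accuracy on `test`.

```python
import random


def k_fold_splits(items, k=5, seed=0):
    """Yield (train, test) pairs; each item lands in exactly one test fold."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled items round-robin into k folds.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


def cross_validated_accuracy(items, evaluate, k=5, seed=0):
    """Average the per-fold accuracy returned by the hypothetical evaluate()."""
    scores = [evaluate(train, test) for train, test in k_fold_splits(items, k, seed)]
    return sum(scores) / len(scores)


def train_test_accuracy(train_items, test_items, evaluate):
    """Fixed split: train once on train_items, score once on test_items."""
    return evaluate(train_items, test_items)
```

A score averaged over 5 folds and a score from a single train/dev/test split are measured on different test sets, so they are probably not directly comparable.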