google-deepmind / mathematics_dataset

This dataset code generates mathematical question and answer pairs, from a range of question types at roughly school-level difficulty.
Apache License 2.0

Where is the rest of the pre-generated data? #10

Closed: mathemakitten closed this issue 4 years ago

mathemakitten commented 4 years ago

Hi! I'm looking to reproduce the results in the paper, but when I download from the link in the README, the .tar file only contains train-medium.

The GitHub README says: 'Note the training data for each question type is split into "train-easy", "train-medium", and "train-hard".'

Should the train-easy, train-hard, and test-easy/medium/hard datasets be included in the pre-generated files hosted on GCP, or do we need to generate them ourselves? If the latter, how can we ensure that the generated data and results exactly match those in the paper?

davidsaxton commented 4 years ago

I've just downloaded the .tar.gz file and confirmed it contains the subdirectories interpolate, extrapolate, train-easy, train-medium, train-hard (and the file train-readme.txt).

Please re-download and extract again; possibly you had an incomplete download. If the problem persists, please reopen this issue.
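
For anyone who wants to check a download before extracting it, here is a minimal Python sketch that lists which of the subdirectories named in the reply above are absent from the archive. The function name `missing_subdirs` and the archive filename in the usage comment are hypothetical, and the sketch assumes only that each expected directory name shows up somewhere in the member paths, since the exact nesting inside the archive is not stated in this thread.

```python
import tarfile
from pathlib import PurePosixPath

# Subdirectory names taken from the reply above.
EXPECTED_DIRS = {
    "interpolate",
    "extrapolate",
    "train-easy",
    "train-medium",
    "train-hard",
}


def missing_subdirs(archive_path, expected=EXPECTED_DIRS):
    """Return the expected names that never appear as a path component.

    Hypothetical helper: it makes no assumption about how deeply the
    directories are nested inside the archive.
    """
    seen = set()
    # Iterating reads through the whole compressed stream, so a truncated
    # or corrupt download will usually raise an exception here.
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            seen.update(PurePosixPath(member.name).parts)
    return set(expected) - seen


# Usage (the filename is a placeholder, not the real download name):
# print(missing_subdirs("downloaded-archive.tar.gz"))
```

An empty set means all five directories are present. An exception, or a result that still lists several names, points to an incomplete download, which matches the suggestion to download again.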