Please see the last command for training with AllenNLP in the models directory README:
python -m allennlp.run train models/MultiQA_BERTBase.jsonnet -s ../Models/MultiTrain -o "{'dataset_reader': {'sample_size': 75000}, ...
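For context, AllenNLP's `-o`/`--overrides` flag takes a JSON-style string that is merged on top of the training config, so the key being changed here is the nested `dataset_reader.sample_size` entry. Below is a minimal sanity-check sketch, not part of the repo; it uses double-quoted JSON for simplicity and leaves out the remaining override keys elided ("...") in the command above:

```python
# Sketch: load the config with the sample_size override applied and confirm
# the value the dataset reader will be constructed with.
from allennlp.common.params import Params

overrides = '{"dataset_reader": {"sample_size": 75000}}'
params = Params.from_file("models/MultiQA_BERTBase.jsonnet", overrides)
print(params["dataset_reader"]["sample_size"])  # expected: 75000
```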
Setting {'sample_size': 75000} on the DatasetReader trains on only a sub-sample of the training set, 75,000 examples in size.
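To illustrate the mechanism, here is a simplified sketch (not the actual MultiQA reader code; the function name and JSON-lines file format are placeholders): a reader with a `sample_size` argument simply stops yielding examples once that many have been produced, and a negative default would mean reading everything.

```python
# Simplified sketch of the sub-sampling behaviour; not the real reader.
import json
from typing import Any, Dict, Iterator

def read_examples(file_path: str, sample_size: int = -1) -> Iterator[Dict[str, Any]]:
    """Yield at most `sample_size` examples (all of them when sample_size < 0)."""
    with open(file_path) as data_file:
        for count, line in enumerate(data_file):
            if sample_size > -1 and count >= sample_size:
                break  # stop after the requested sub-sample
            yield json.loads(line)

# With the override above, at most 75,000 training examples would be yielded:
# examples = list(read_examples("train.jsonl", sample_size=75000))
```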
Somewhere in the paper you say:
Looking at the command line, it doesn't look like there is any way of specifying the size of the training data. I also couldn't find anywhere in the code where you take a subset of the training data. I just wanted to confirm that the code uses all of the training data.