For the TREC-QA dataset, we follow the norm of using the "Clean TREC-QA" setting, in which we remove from the dev and test sets the questions that have no answers, or whose candidate answer sentences are all correct or all incorrect (we still use the original train set). For more details, you can refer to the original explanation of "Clean TREC-QA" (Inter-Weighted Alignment Network for Sentence Pair Modeling, Shen et al., EMNLP 2017).
The statistics reported in our paper are correct (1229/65/68 questions for the train/dev/test splits). We use train-all.jsonl for the train corpus and {dev/test}-filtered.jsonl for the dev/test corpora from the linked repository.
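In case a concrete picture helps, here is a minimal Python sketch of that filtering rule. The .jsonl schema is an assumption for illustration only: I'm treating each line as one question/candidate pair with a `qid` field and a binary `label` field, which may not match the repository's actual format. Note that the {dev/test}-filtered.jsonl files already have this filtering applied.

```python
import json
from collections import defaultdict

def load_jsonl(path):
    # One JSON object per line.
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

def clean_trecqa(examples, qid_key="qid", label_key="label"):
    # Group candidate labels by question, then keep only questions that
    # have at least one correct AND at least one incorrect candidate,
    # matching the "Clean TREC-QA" setting described above.
    labels = defaultdict(list)
    for ex in examples:
        labels[ex[qid_key]].append(int(ex[label_key]))
    keep = {q for q, ls in labels.items() if any(ls) and not all(ls)}
    return [ex for ex in examples if ex[qid_key] in keep]

# The filter is applied to dev/test only; the train set is used as-is.
dev = clean_trecqa(load_jsonl("raw-dev.jsonl"))  # hypothetical raw file name
```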
That makes sense. Thank you for your reply!
In the original paper you state that the filtered TREC-QA dataset has 1,229, 65, and 68 questions. However, in the repo you use *-filtered.jsonl from here, and the statistics do not correspond to the above numbers. Is that a mistake in the paper?