MrShininnnnn closed this issue 1 year ago
@MrShininnnnn Hello, I came across this issue today. Did you resolve it?
We are sorry for the clerical error in our paper. After detailed inspection, we found that we sampled 10k question pairs each for the validation and test sets. Therefore, the actual sizes of the training, validation, and test sets are 129,263/10k/10k. We release the outputs for the whole validation and test sets in the result folder, so there are 20k samples for each round of paraphrase generation.
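For anyone checking the numbers, the corrected split can be sketched as below. This is only an illustration under stated assumptions: the function name `split_pairs`, the uniform shuffle, and the seed are hypothetical, and the thread does not say how the 10k pairs were actually sampled. It mainly shows that 129,263 + 10k + 10k accounts for all 149,263 Quora pairs.

```python
import random


def split_pairs(pairs, n_valid=10_000, n_test=10_000, seed=0):
    """Shuffle a copy of `pairs` and carve off validation and test sets.

    Hypothetical helper: the sampling procedure used for the paper is not
    described in this thread, so a seeded uniform shuffle is assumed here.
    """
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    valid = shuffled[:n_valid]
    test = shuffled[n_valid:n_valid + n_test]
    train = shuffled[n_valid + n_test:]
    return train, valid, test


# Stand-in data: integer ids in place of the real question pairs.
train, valid, test = split_pairs(range(149_263))
print(len(train), len(valid), len(test))  # 129263 10000 10000
```

The three parts are disjoint by construction, so the validation and test outputs together cover 20k samples.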
Also, we find that the validation and test sets each actually contain 10k samples.
For Quora, there are actually 149,263 samples in total, which does not match the data split reported in the paper (129,263/3k/3k). Is there a reason for not using the full dataset? Thanks.