nyu-dl / dl4ir-searchQA

BSD 3-Clause "New" or "Revised" License

Empty ques. in train file #1

Open vardaan123 opened 6 years ago

vardaan123 commented 6 years ago

Some questions in the training file at https://drive.google.com/open?id=0B51lBZ1gs1XTR3BIVTJQWkREQU0 are of zero length (I assume '|||' is the delimiter separating context, question, and answer, in that order). Some questions are only one character long, which also doesn't make sense. Also, you wrote that the scripts to generate the train, val, and test files would be released; could you please release them soon? Thanks!
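For anyone who wants to reproduce the report, here is a minimal sketch of the check described above. It assumes one record per line with exactly three '|||'-separated fields (context, question, answer), which is only my reading of the file format. The function name `find_suspect_questions` and the `min_len` threshold are made up for illustration and are not part of the repository.

```python
from typing import Iterable, List, Optional, Tuple

# Assumed field delimiter; the actual file format has not been confirmed.
DELIMITER = "|||"


def find_suspect_questions(
    lines: Iterable[str], min_len: int = 2
) -> List[Tuple[int, Optional[str]]]:
    """Return (line_number, question) pairs for questions shorter than min_len.

    Records that do not split into exactly three fields are reported with
    a question of None so they can be inspected separately.
    """
    suspects: List[Tuple[int, Optional[str]]] = []
    for line_no, line in enumerate(lines, start=1):
        # Split on the assumed delimiter and trim whitespace around each field.
        fields = [field.strip() for field in line.rstrip("\n").split(DELIMITER)]
        if len(fields) != 3:
            suspects.append((line_no, None))
            continue
        _context, question, _answer = fields
        if len(question) < min_len:
            suspects.append((line_no, question))
    return suspects


# Small in-memory example (made-up records, not taken from the dataset).
sample = [
    "some context ||| who wrote it ||| someone\n",
    "some context |||  ||| answer\n",
    "some context ||| a ||| answer\n",
    "a line without delimiters\n",
]
result = find_suspect_questions(sample)
```

To run it on the real data, open the downloaded training file in text mode (for example with `open(path, encoding="utf-8")`) and pass the file handle as `lines`; the path depends on where you saved the download.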

mattyd2 commented 6 years ago

@vardaan123 please find the raw, split, and processed data here:

https://drive.google.com/drive/u/2/folders/1kBkQGooNyG0h8waaOJpgdGtOnlb1S649

I can share the preprocessing scripts with you.