husseinmozannar / SOQAL

Arabic Open Domain Question Answering System using Neural Reading Comprehension
MIT License

ARCD test and train datasets #4

Closed: ahmednasserswe closed this issue 4 years ago

ahmednasserswe commented 4 years ago

Hello,

I cannot find the ARCD test and train sets; the only file I see under /data is arcd.json. How can I get the train and test split described in the paper? I'd like to compare my results to yours, so I need the exact datasets you used.

Thank you!

husseinmozannar commented 4 years ago

Hey Ahmed, I just uploaded the train and test splits as separate files. The split is 50-50 and is made at the article level, so paragraphs from the same article appear in only one of the splits. If you have the computational resources, you could do two-fold CV (swap the roles of train and test, or draw a different split) to validate your model more thoroughly.
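
For anyone who wants to draw a different article-level split or run the two-fold CV mentioned above, here is a minimal sketch. It assumes arcd.json follows a SQuAD-style layout (a top-level `data` list of articles, each with a `paragraphs` list); the function names and the `seed` parameter are made up for illustration and are not part of the SOQAL code. A split drawn this way will not match the uploaded train and test files, so use those files when comparing against the paper's numbers.

```python
import json
import random


def load_squad_json(path):
    """Load a SQuAD-style JSON file (assumed layout: {"data": [articles]})."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def split_by_article(dataset, seed=0):
    """Split the dataset 50-50 at the article level.

    Whole articles are assigned to one half or the other, so paragraphs
    from the same article never end up in both halves.
    """
    articles = list(dataset["data"])
    random.Random(seed).shuffle(articles)  # seeded for repeatability
    half = len(articles) // 2
    first = {"version": dataset.get("version"), "data": articles[:half]}
    second = {"version": dataset.get("version"), "data": articles[half:]}
    return first, second


def two_fold(first, second):
    """Yield (train, test) pairs, swapping the roles of the two halves."""
    yield first, second
    yield second, first


def count_paragraphs(split):
    """Count paragraphs across all articles in one split."""
    return sum(len(article["paragraphs"]) for article in split["data"])
```

To use it, call `split_by_article(load_squad_json("data/arcd.json"))`, then train and evaluate once per pair from `two_fold` and average the two scores. Because articles differ in length, the two halves hold equal numbers of articles but not necessarily equal numbers of paragraphs or questions.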