husseinmozannar / SOQAL

Arabic Open Domain Question Answering System using Neural Reading Comprehension
MIT License

Arabic-SQUAD train/dev/test splits #22

Open spookyQubit opened 3 years ago

spookyQubit commented 3 years ago

Hi @husseinmozannar, thanks a lot for sharing the code and the data.

I have a small question regarding Arabic-SQuAD. The paper mentions that Arabic-SQuAD is split 80-10-10% into three parts for training, development, and testing, that Arabic-SQuAD-Test is composed of 2,966 questions on 24 articles, and that articles are distinct between the parts.

Is there an official split that one should use for Table 5 of the paper? I ask because Arabic-SQuAD.json comes without any train/dev/test markings (unless I am looking at the wrong file). Would it be possible to share the splits?
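
For context, without an official split, the closest I could get is an article-level split along the lines the paper describes. The sketch below is only an approximation under several assumptions: that the file follows the SQuAD v1.1 layout (a top-level `data` list of articles, each with `paragraphs` containing `qas`), that the 80-10-10 proportion is applied over articles rather than questions, and that a fixed shuffle seed is acceptable. The function names and the seed are hypothetical, and this would not reproduce the exact split used in the paper.

```python
import random


def split_by_article(articles, seed=0, train_frac=0.8, dev_frac=0.1):
    """Shuffle articles and cut them into train/dev/test parts.

    Splitting at the article level keeps every article in exactly one part.
    The seed and the fractions are illustrative assumptions, not values
    taken from the paper or the repository.
    """
    shuffled = list(articles)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)
    return {
        "train": shuffled[:n_train],
        "dev": shuffled[n_train:n_train + n_dev],
        "test": shuffled[n_train + n_dev:],  # whatever remains
    }


def count_questions(articles):
    """Count questions across articles, assuming the SQuAD v1.1 layout."""
    return sum(len(p["qas"]) for a in articles for p in a["paragraphs"])
```

If the file does follow that layout, the articles could be loaded with `json.load(f)["data"]` and passed to `split_by_article`, and `count_questions` could then be compared against the 2,966 test questions reported in the paper. A random split like this is unlikely to match that figure, which is why an official split would be very helpful.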

Thanks a lot in advance.