Has the dataset structure been changed? - Githubissues

ymcui / cmrc2018

A Span-Extraction Dataset for Chinese Machine Reading Comprehension (CMRC 2018)

https://ymcui.github.io/cmrc2018/

Creative Commons Attribution Share Alike 4.0 International


Has the dataset structure been changed? #5

Closed xiongma closed 4 years ago

xiongma commented 4 years ago

The dataset structure used in your baseline code is completely different from the directory structure currently on GitHub.

ymcui commented 4 years ago

The baseline most likely used the SQuAD-style version of the dataset, which can be downloaded from CodaLab: https://worksheets.codalab.org/worksheets/0x92a80d2fab4b4f79a2b4064f7ddca9ce
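For readers unfamiliar with the layout, here is a minimal sketch of walking a SQuAD-style file. It assumes the CodaLab files follow the standard SQuAD v1.1 nesting (`data` > `paragraphs` > `qas` > `answers`); the helper names `load_squad_file` and `iter_squad_examples` and the sample record are hypothetical and are not taken from the baseline code.

```python
import json


def load_squad_file(path):
    """Read a SQuAD-style JSON file (hypothetical helper, UTF-8 assumed)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_squad_examples(dataset):
    """Yield (question_id, question, context, answers) tuples.

    Assumes the standard SQuAD v1.1 nesting:
    data -> paragraphs -> qas -> answers.
    """
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                yield qa["id"], qa["question"], context, qa["answers"]


# Tiny hand-made sample in the SQuAD v1.1 layout (not real CMRC 2018 data).
sample = {
    "version": "v1.0",
    "data": [
        {
            "title": "example",
            "paragraphs": [
                {
                    "context": "北京是中国的首都。",
                    "qas": [
                        {
                            "id": "EXAMPLE_1",
                            "question": "中国的首都是哪里？",
                            "answers": [{"text": "北京", "answer_start": 0}],
                        }
                    ],
                }
            ],
        }
    ],
}

examples = list(iter_squad_examples(sample))
for question_id, question, context, answers in examples:
    print(question_id, question, answers[0]["text"])
```

To read a downloaded file instead of the inline sample, pass the result of `load_squad_file(path)` to `iter_squad_examples`.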

xiongma commented 4 years ago

Is dev the test set? Is trial the challenge set?

ymcui commented 4 years ago

dev: development set; trial: trial set; test: test set; challenge: challenge set

ymcui commented 4 years ago

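The test set and challenge set are not publicly released; they are only used for evaluation after you submit a system. Please see the CodaLab page for details.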

xiongma commented 4 years ago

OK

xiongma commented 4 years ago

Could you share the code you used to train Chinese-BERT-wwm on cmrc2018?

ymcui commented 4 years ago

Nothing was changed on the model side. As long as your data format matches, it should basically work.
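One quick way to check that converted data "matches" is to verify that every answer span lines up with its context. The sketch below assumes SQuAD v1.1-style fields (`context`, `qas`, `answers`, `text`, `answer_start`, with character offsets); the function `find_misaligned_answers` and the sample records are hypothetical and are not part of the baseline.

```python
def find_misaligned_answers(dataset):
    """Return ids of questions whose answer text is not found at answer_start.

    Assumes SQuAD v1.1-style nesting and character-level offsets.
    """
    bad_ids = []
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for answer in qa["answers"]:
                    start = answer["answer_start"]
                    text = answer["text"]
                    # The slice at the recorded offset must equal the answer.
                    if context[start:start + len(text)] != text:
                        bad_ids.append(qa["id"])
                        break  # one bad answer is enough to flag the question
    return bad_ids


# Hand-made sample (not real CMRC 2018 data): one aligned, one misaligned.
sample = {
    "data": [
        {
            "title": "example",
            "paragraphs": [
                {
                    "context": "北京是中国的首都。",
                    "qas": [
                        {
                            "id": "GOOD_1",
                            "question": "中国的首都是哪里？",
                            "answers": [{"text": "北京", "answer_start": 0}],
                        },
                        {
                            "id": "BAD_1",
                            "question": "北京是中国的什么？",
                            # Wrong offset on purpose: "首都" starts at 6.
                            "answers": [{"text": "首都", "answer_start": 5}],
                        },
                    ],
                }
            ],
        }
    ]
}

bad_ids = find_misaligned_answers(sample)
print(bad_ids)
```

An empty result suggests the offsets are consistent; it does not guarantee that the file matches whatever else the baseline expects.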

xiongma commented 4 years ago

OK, got it.