baidu / DuReader

Baseline Systems of DuReader Dataset
http://ai.baidu.com/broad/subordinate?dataset=dureader
1.14k stars, 308 forks

Problem running paddle with the demo data #24

Closed: urcllr closed this issue 6 years ago

urcllr commented 6 years ago

Running the paddle infer step requires the data under preprocessed/testnet.

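However, the GitHub code does not include preprocessed data for the demo. I used the commands from the "Preprocess the Data" section to generate the testnet preprocessed data, and they failed. The trainset and devset succeeded; only the testset failed. On inspection, search.test.json indeed has no segmented_answers key.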

As a result, the paddle infer step cannot be run with the demo data: after it finishes, the infer directory under models is empty.
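A quick way to confirm which records lack the key before running the preprocess script is sketched below. It assumes the data file holds one JSON object per line; the helper name `missing_segmented_keys` is hypothetical and is not part of the DuReader code.

```python
import json


def missing_segmented_keys(lines, required=("segmented_answers",)):
    """Return (line_number, missing_keys) for every record lacking a required key.

    Assumption: `lines` is an iterable of strings, one JSON object per line.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        missing = [key for key in required if key not in record]
        if missing:
            problems.append((lineno, missing))
    return problems


# Two in-memory sample records; the second has no segmented_answers key.
sample = [
    json.dumps({"question": "q1", "segmented_answers": [["a"]]}),
    json.dumps({"question": "q2"}),
]
print(missing_segmented_keys(sample))  # [(2, ['segmented_answers'])]
```

To check a real file, pass an open file handle instead of the sample list, for example `missing_segmented_keys(open("search.test.json", encoding="utf-8"))`, adjusting the path to wherever the demo data lives.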

lkliukai commented 6 years ago

For convenience, please use the preprocessed version of our dataset, or segment the questions, documents, and references yourself and provide the "segmented_*" fields for the preprocess script.
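A minimal sketch of the second option, adding segmented fields to a raw record, might look like the following. Only `segmented_answers` is named in this thread; the other field names (`question`, `answers`, `documents`, `title`, `paragraphs`, and their `segmented_*` counterparts) are assumptions about the record layout and should be checked against the dataset documentation. The `tokenize` function is a hypothetical stand-in that splits text into single characters; a real run would swap in a proper Chinese word segmenter.

```python
import json


def tokenize(text):
    """Hypothetical stand-in tokenizer: one token per non-whitespace character.

    Replace with a real word segmenter for actual preprocessing.
    """
    return [ch for ch in text if not ch.isspace()]


def add_segmented_fields(record, tokenizer=tokenize):
    """Return a copy of `record` with segmented_* fields added.

    Field names other than segmented_answers are assumptions.
    """
    out = dict(record)
    if "question" in record:
        out["segmented_question"] = tokenizer(record["question"])
    if "answers" in record:
        out["segmented_answers"] = [tokenizer(a) for a in record["answers"]]
    if "documents" in record:
        docs = []
        for doc in record["documents"]:
            new_doc = dict(doc)
            if "title" in doc:
                new_doc["segmented_title"] = tokenizer(doc["title"])
            if "paragraphs" in doc:
                new_doc["segmented_paragraphs"] = [
                    tokenizer(p) for p in doc["paragraphs"]
                ]
            docs.append(new_doc)
        out["documents"] = docs
    return out


# Made-up sample record, for illustration only.
raw = {
    "question": "a b",
    "answers": ["c d"],
    "documents": [{"title": "e", "paragraphs": ["f g"]}],
}
print(json.dumps(add_segmented_fields(raw), ensure_ascii=False))
```

Applying this to every line of a raw file and writing the results back out, one JSON object per line, would produce input that carries the segmented fields the preprocess script expects, under the layout assumptions stated above.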

urcllr commented 6 years ago

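Sorry, I did not fully understand the preprocess section yesterday and missed that I needed to do the word segmentation myself first and put the results into the corresponding segmented fields.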