baidu / DuReader

Baseline Systems of DuReader Dataset
http://ai.baidu.com/broad/subordinate?dataset=dureader
1.13k stars 308 forks

Error when running paddle train with the data from dureader_preprocessed.zip #25

Closed urcllr closed 5 years ago

urcllr commented 6 years ago

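I downloaded dureader_preprocessed.zip with data/download.sh and unzipped it to get the JSON files for testset, trainset, and devset. I then truncated each JSON file to its first 3000 lines with head -n 3000. The earlier steps ran without errors, but the train step printed a large number of SKIP messages (screenshot: paddle-train).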

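After that, infer did not report any error either, but once it finished, the infer directory under models was empty. (The paddle-infer.png screenshot did not finish uploading.)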

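I had previously tried running on the full data without truncating it, and the train step also printed a large number of SKIP messages. However, my machine does not meet the requirements, so train crashed before finishing and I never got to test the subsequent infer step.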

lkliukai commented 6 years ago

The "skip, wrong answer docs" warning appears because our program assumes that the preprocessing script finds exactly one answer document per question, but no answer document is found when the question does not have any answer. The warning is normal and can be ignored; it is unrelated to inference. To test inference, you should run:

bash run.sh test_bidaf bidaf infer --testset ../data/preprocessed/testset/search.test.json
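To gauge how many samples trigger that warning, the "exactly one answer document" condition can be counted on the preprocessed files directly. The following is only a minimal sketch, not the repository's own check: it assumes each line of the file is one JSON sample and that the answer documents are listed in a field named `answer_docs`; the helper name `count_answer_docs` is hypothetical.

```python
import json
from collections import Counter


def count_answer_docs(lines):
    """Tally how many answer documents each sample has.

    `lines` is any iterable of JSON strings, one sample per line.
    The `answer_docs` field name is an assumption about the data layout.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        sample = json.loads(line)
        counts[len(sample.get("answer_docs", []))] += 1
    return counts


if __name__ == "__main__":
    # Two inline samples stand in for a real file: one answerable, one not.
    demo = [
        '{"question": "q1", "answer_docs": [0]}',
        '{"question": "q2", "answer_docs": []}',
    ]
    counts = count_answer_docs(demo)
    skipped = sum(n for k, n in counts.items() if k != 1)
    print(f"kept={counts[1]} skipped={skipped}")
```

To apply it to a real file, pass an open file handle instead of the inline list, for example `count_answer_docs(open(path, encoding="utf-8"))`. If most samples are skipped rather than a minority, that would suggest the data itself is the problem rather than the expected no-answer case.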

urcllr commented 6 years ago

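Thanks for your reply. I did go on to run the command you mentioned yesterday, and no errors seemed to appear. The screenshot is the paddle-infer.png above that stayed stuck on uploading, so I will try uploading it again here. However, after infer finishes, the infer directory under models is empty. Where is the result file stored? (screenshot: paddle-infer 1)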

lkliukai commented 6 years ago

The result file should be under the path 'model_name/infer'. Please check the log files under '.../test_bidaf/log' to locate the problem.
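A small helper along these lines may make that check quicker. It is only a sketch under assumptions: the function names `summarize_run` and `error_lines` are hypothetical, the model directory is assumed to contain `infer` and `log` subdirectories as described in this thread, and the keyword list is a guess at what a failure might print, not a list of messages the baseline is known to emit.

```python
from pathlib import Path


def summarize_run(model_dir):
    """Report which files exist under infer/ and log/ for one model directory."""
    model_dir = Path(model_dir)
    summary = {}
    for sub in ("infer", "log"):
        folder = model_dir / sub
        if folder.is_dir():
            summary[sub] = sorted(p.name for p in folder.iterdir())
        else:
            summary[sub] = None  # directory is missing entirely
    return summary


def error_lines(log_path, keywords=("error", "traceback", "killed")):
    """Return log lines that contain any keyword (case-insensitive match)."""
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as f:
        for line in f:
            lowered = line.lower()
            if any(k in lowered for k in keywords):
                hits.append(line.rstrip("\n"))
    return hits
```

Calling `summarize_run` on the model directory shows at a glance whether an inference log was ever written; if only train.log is present, the infer step probably never started or was terminated before it could write anything.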

urcllr commented 6 years ago

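Below are screenshots of the beginning and end of model/test_bidaf/log/train.log (train.log is the only file in that directory). The omitted middle part consists of similar skip… messages. I would like to know what went wrong in this case.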

abc

urcllr commented 6 years ago

Does nobody know what is going wrong here?

KruskalLin commented 6 years ago

I ran into the same problem. When I later ran it directly, I found that the process had been killed.

lkliukai commented 6 years ago

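> Below are screenshots of the beginning and end of model/test_bidaf/log/train.log (train.log is the only file in that directory). The omitted middle part consists of similar skip… messages. I would like to know what went wrong in this case.
>
> abc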

The training log looks normal. What happens when you run it on the demo data?

python utils/get_vocab.py --files data/demo/trainset/search.train.json data/demo/devset/search.dev.json --vocab data/demo/vocab.search