A question about debugging the code

vio0109 commented 6 years ago

Hello, when I use your trained model to predict on kp20k and the other test sets, I also run into the following error:

Loading testing dataset KP20k from /home/majun/keyphrase_generation/seq2seq-keyphrase-master/dataset/keyphrase/baseline-data/kp20k/ kp20k
Size of test data=0
/home/majun/anaconda2/lib/python2.7/site-packages/numpy/lib/function_base.py:1110: RuntimeWarning: Mean of empty slice.
  avg = a.mean(axis)
Traceback (most recent call last):
  File "/home/majun/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/majun/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/majun/keyphrase_generation/seq2seq-keyphrase-master/keyphrase/keyphrase_copynet.py", line 533, in <module>
    print('Avg length=%d, Max length=%d' % (np.average([len(s) for s in test_set['source']]), np.max([len(s) for s in test_set['source']])))
  File "/home/majun/anaconda2/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2272, in amax
    out=out, **kwargs)
  File "/home/majun/anaconda2/lib/python2.7/site-packages/numpy/core/_methods.py", line 26, in _amax
    return umr_maximum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation maximum which has no identity

I tried the fixes suggested in the earlier issues and debugged it a bit, but the error is still raised. Could you advise how to resolve this? Thank you very much. (My Theano version is 0.8.2.)

memray commented 6 years ago

“Size of test data=0” means it doesn't load the test data correctly. Could you check that part out?
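To illustrate why an empty test set ends in that ValueError, here is a minimal sketch, not code from the repository. The helper name `describe_lengths` and the toy `test_set` are hypothetical; the idea is only to fail early with a message that points at the loading step instead of at `np.max`.

```python
import numpy as np


def describe_lengths(sources):
    """Return (average, maximum) sequence length, failing early on an empty set."""
    lengths = [len(s) for s in sources]
    if not lengths:
        # np.max on an empty list raises "zero-size array to reduction
        # operation maximum which has no identity", so report the real
        # cause (nothing was loaded) instead.
        raise ValueError('test set is empty; check the dataset path and file extensions')
    return float(np.average(lengths)), int(np.max(lengths))


# Toy stand-in for the loaded data (hypothetical contents).
test_set = {'source': [['a', 'b', 'c'], ['d']]}
avg_len, max_len = describe_lengths(test_set['source'])
print('Avg length=%d, Max length=%d' % (avg_len, max_len))
```

With a guard like this, "Size of test data=0" would stop the run with a message about the data path instead of a NumPy reduction error.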

vio0109 commented 6 years ago

I checked the path /home/majun/keyphrase_generation/seq2seq-keyphrase-master/dataset/keyphrase/baseline-data/kp20k/ and it does contain the keyphrase and text folders. The downloaded data is in those folders as well.

memray commented 6 years ago

It's my bad. Please check out the latest code. The previous one loads .keyphrase files instead of .txt. I only tested it on CPU (I cannot run the GPU version after updating CUDA and Theano), and I hope it will work now.
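For anyone who wants to confirm which extensions are actually present before rerunning, a small sketch along these lines may help. It is not part of the repository, and `count_extensions` is a hypothetical helper; it simply walks a directory and tallies files by extension.

```python
import os
from collections import Counter


def count_extensions(root):
    """Count files under root, grouped by file extension."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext('1.txt') -> ('1', '.txt'); files without an extension map to ''
            counts[os.path.splitext(name)[1]] += 1
    return counts


if __name__ == '__main__':
    # Replace '.' with the dataset directory you want to inspect.
    print(count_extensions('.'))
```

If the loader expects one extension and the directory only contains another, the test set will come back empty even though the files are there.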

vio0109 commented 6 years ago

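Thank you very much for your reply! I downloaded your latest source code and put Experiment/ and dataset/ into the project. Sorry, but I still have some questions about paths that I need to ask you.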

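1. Running the predict part, with the other options set to false. When predicting on kp20k, I first ran into the following error:
I modified the paths in keyphrase_test_dataset.py, although I may not have changed them quite correctly.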

After that, it could run.

However, while it was running I could not find the generated file config['predict_path']/predict.generative.dataset_name.pkl. Is it only generated after all predictions have finished?

2. Predicting on SemEval.

Could you explain what each of these paths stands for, and how to load them without getting errors?

memray commented 6 years ago

Yes, predict.generative.dataset_name.pkl will be generated once all predictions are finished, so it takes a very long time on KP20k.
As for the paths, I think they are all included in seq2seq-keyphrase.zip.
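A quick way to check whether a prediction run has finished is to look for that output file. The sketch below is an illustration, not repository code: the helper names are hypothetical, and it assumes that "dataset_name" in the filename stands for the actual dataset name (for example kp20k).

```python
import os


def prediction_file(config, dataset_name):
    """Build the expected output path (assumes dataset_name is substituted into the filename)."""
    return os.path.join(config['predict_path'],
                        'predict.generative.%s.pkl' % dataset_name)


def is_finished(config, dataset_name):
    """True once the prediction pickle exists, i.e. all predictions were written."""
    return os.path.isfile(prediction_file(config, dataset_name))


if __name__ == '__main__':
    config = {'predict_path': '.'}  # hypothetical value; use your own config
    print(prediction_file(config, 'kp20k'), is_finished(config, 'kp20k'))
```

Until the file appears, the run is presumably still iterating over the test documents.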

memray / seq2seq-keyphrase

A question about debugging the code #18