memray / seq2seq-keyphrase

MIT License
Question about debugging the code #18

Closed vio0109 closed 6 years ago

vio0109 commented 6 years ago

Hello. When I used your pretrained model to predict on kp20k and the other test sets, I also ran into the following error:

    Loading testing dataset KP20k from /home/majun/keyphrase_generation/seq2seq-keyphrase-master/dataset/keyphrase/baseline-data/kp20k/ kp20k
    Size of test data=0
    /home/majun/anaconda2/lib/python2.7/site-packages/numpy/lib/function_base.py:1110: RuntimeWarning: Mean of empty slice.
      avg = a.mean(axis)
    Traceback (most recent call last):
      File "/home/majun/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "/home/majun/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
      File "/home/majun/keyphrase_generation/seq2seq-keyphrase-master/keyphrase/keyphrase_copynet.py", line 533, in <module>
        print('Avg length=%d, Max length=%d' % (np.average([len(s) for s in test_set['source']]), np.max([len(s) for s in test_set['source']])))
      File "/home/majun/anaconda2/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2272, in amax
        out=out, **kwargs)
      File "/home/majun/anaconda2/lib/python2.7/site-packages/numpy/core/_methods.py", line 26, in _amax
        return umr_maximum(a, axis, None, out, keepdims)
    ValueError: zero-size array to reduction operation maximum which has no identity

I tried debugging by following the fixes suggested in earlier issues, but the error persists. Could you advise how to resolve this? Thank you very much. (My Theano version is 0.8.2.)

memray commented 6 years ago

“Size of test data=0” means it doesn't load the test data correctly. Could you check that part out?
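As a minimal sketch of that check, the script could fail early with a clearer message when nothing was loaded, rather than letting the maximum over an empty list raise the "zero-size array" error shown in the traceback. The function name `length_stats` and the `data_dir` argument are hypothetical and not part of the repository; the input corresponds to `test_set['source']` from the traceback.

```python
def length_stats(sources, data_dir):
    """Return (average length, max length) of the loaded source sequences.

    Raises a descriptive error when nothing was loaded, instead of the
    less helpful error produced by taking a max over an empty list.
    `data_dir` is only used to make the error message more informative.
    """
    if len(sources) == 0:
        raise ValueError(
            "No test examples were loaded from %r; check the path and "
            "the file extension the loader expects." % data_dir
        )
    lengths = [len(s) for s in sources]
    return sum(lengths) / float(len(lengths)), max(lengths)
```

Called as `length_stats(test_set['source'], path)` before the print statement, this would surface the loading problem directly.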

vio0109 commented 6 years ago

I checked the path /home/majun/keyphrase_generation/seq2seq-keyphrase-master/dataset/keyphrase/baseline-data/kp20k/ and it does contain the keyphrase and text folders. Both folders also contain the downloaded data.

memray commented 6 years ago

My mistake. Please check out the latest code. The previous version loaded .keyphrase files instead of .txt files. I have only tested it on CPU (I cannot run the GPU version after updating CUDA and Theano), but I hope it works now.
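For anyone hitting the same symptom, one quick way to see which file types a dataset folder actually contains is to count files by extension. This is only an illustrative sketch using the standard library; the helper name `count_extensions` is hypothetical and not part of the repository.

```python
import os
from collections import Counter


def count_extensions(root):
    """Count files under `root` (recursively), grouped by file extension."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext returns ('name', '.ext'); files without an
            # extension are counted under the empty string.
            counts[os.path.splitext(name)[1]] += 1
    return counts
```

Running it on the kp20k folder and comparing the result with the extension the loader looks for shows whether a mismatch explains an empty test set.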

vio0109 commented 6 years ago

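Thank you very much for your reply! I downloaded your latest source code and placed Experiment/ and dataset/ in the project. Sorry, I still have a few path-related questions.

  1. I ran only the predict step, with the other options set to false. When predicting on kp20k, I first hit the following error:

    [Screenshot 1]

    I modified the path in keyphrase_test_dataset.py, though my change may not be quite right.

    [Screenshot 2]

    After that it ran:

    [Screenshot 3]

    However, while it was running I could not find the output file config['predict_path']/predict.generative.dataset_name.pkl. Is it only generated after all predictions have finished?

  2. Predicting on SemEval:

    [Screenshot 4]

    What do these paths refer to, and how should they be loaded to avoid the error?

    [Screenshot 5]
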
memray commented 6 years ago
  1. Yes, predict.generative.dataset_name.pkl is generated only once all predictions are finished, so it takes a very long time on KP20k.
  2. I think they are all included in seq2seq-keyphrase.zip.
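Since the output file only appears at the end of the run, a small helper can check for it and load it when present. This is a sketch under assumptions: the file name pattern follows the one quoted in the question, the file is assumed to be a standard pickle, and `load_predictions_if_ready` is a hypothetical name that does not exist in the repository.

```python
import os
import pickle


def load_predictions_if_ready(predict_path, dataset_name):
    """Return the unpickled predictions, or None if the file is not there yet.

    Assumes the output is named predict.generative.<dataset_name>.pkl
    inside `predict_path` and was written with the standard pickle module.
    """
    filename = "predict.generative.%s.pkl" % dataset_name
    full_path = os.path.join(predict_path, filename)
    if not os.path.exists(full_path):
        return None  # prediction has not finished (or the path is wrong)
    with open(full_path, "rb") as f:
        return pickle.load(f)
```

A `None` result while the job is still running is expected; a `None` result after it finishes suggests that `predict_path` points somewhere other than where the script wrote its output.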