xing-hu / EMSE-DeepCom

The dataset for EMSE-DeepCom
MIT License

Why is the corpus_bleu 0.0001? I have trained the model. #23

Open satinewee opened 3 years ago

satinewee commented 3 years ago

decaying learning rate to: 0.061
decaying learning rate to: 0.058
step 46000 epoch 43 learning rate 0.058 step-time 3.065 loss 0.764
test eval loss: 123.05
start decoding
corpus_bleu: 0.0001 avg_score: 0.2200

And where is the final output? Is it in model/eval/test.46000.out? Thank you.
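For reference, the in-training corpus_bleu is an nltk score on a 0-1 scale, while multi-bleu.perl (discussed below) reports percentages. A minimal sketch of cross-checking the decoded output with nltk, assuming model/eval/test.46000.out holds one decoded summary per line and assuming a line-aligned reference file (the reference path is a placeholder):

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# Placeholder paths: hypotheses and references aligned line by line,
# one tokenized sentence per line.
with open("model/eval/test.46000.out") as f:
    hypotheses = [line.split() for line in f]
with open("dataset/test.token.nl") as f:          # assumed reference file
    references = [[line.split()] for line in f]  # nltk expects a list of refs

# Without smoothing, a zero match count for any n-gram order collapses
# the corpus score toward zero, so near-zero values often point to a
# tokenization or line-alignment mismatch rather than empty output.
score = corpus_bleu(references, hypotheses,
                    smoothing_function=SmoothingFunction().method4)
print("nltk corpus BLEU: %.4f" % score)
```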

satinewee commented 3 years ago

Hello, and thank you very much for replying to my question. I guess you are probably also Chinese, so I will use Chinese, as I'm afraid my English may not express my meaning well.

1. The attachment contains my code and dataset: config.yaml is in the source code folder, dataset holds the dataset, and model holds the model's run output.
2. When I run this code on my own dataset (the one in this attachment, which is the dataset of [1]; according to that paper, it was in turn obtained from your paper [2]), the printed corpus bleu is 0 every time, and the BLEU score of the final generated test.xxx.out is only 0.02.
3. Also, when I run this code on the DeepCom dataset itself, the printed corpus bleu is 0 or 0.0001 every time. Is there perhaps a problem somewhere in the code?
4. Or is it that the overall length of the current dataset is twice that of the DeepCom dataset, so the model cannot generate good results for it? (See the sketch below.)
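On point 4, the length claim can be verified directly. A minimal sketch, assuming one tokenized sequence per line (both paths are placeholders for the two datasets' layouts):

```python
# Placeholder paths: point these at the code/comment files of each dataset.
for path in ["dataset/train.token.code", "deepcom/train.token.code"]:
    with open(path) as f:
        lengths = [len(line.split()) for line in f]
    print("%s: %d sequences, avg length %.1f"
          % (path, len(lengths), sum(lengths) / len(lengths)))
```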

If I could receive another reply from you, I would be extremely grateful!

[1] Jian Zhang, Xu Wang, Hongyu Zhang, et al. Retrieval-based Neural Source Code Summarization.
[2] Xing Hu, Ge Li, Xin Xia, et al. 2018. Summarizing Source Code with Transferred API Knowledge.

------------------ Original ------------------
From: xing-hu/EMSE-DeepCom <notifications@github.com>
Date: Thu, Feb 18, 2021 2:41 PM
Subject: Re: [xing-hu/EMSE-DeepCom] Why the corpus_bleu is 0.0001? I have trained the model. (#23)

can you share your code?

xing-hu commented 3 years ago

The final corpus BLEU score is computed by multi-bleu.perl instead of nltk. Is the score computed by multi-bleu.perl also 0.000?
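A minimal sketch of that check, assuming a Moses-style multi-bleu.perl script in the working directory and one hypothesis/reference per line (both file paths are placeholders for your own layout):

```python
import subprocess

hyp_file = "model/eval/test.46000.out"   # decoded summaries, one per line
ref_file = "dataset/test.token.nl"       # placeholder reference file

# multi-bleu.perl takes the reference file as an argument and reads the
# hypotheses from stdin, printing a line such as
# "BLEU = 3.63, 20.8/5.4/3.0/2.2 (BP=0.698, ...)". Scores are percentages.
with open(hyp_file) as hyp:
    subprocess.run(["perl", "multi-bleu.perl", ref_file],
                   stdin=hyp, check=True)
```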

satinewee commented 3 years ago

Hello, and thank you for your reply. I recomputed the score with multi-bleu.perl as you suggested and got the following result:

BLEU = 3.63, 20.8/5.4/3.0/2.2 (BP=0.698, ratio=0.736, hyp_len=80639, ref_len=109631)

I should note that the dataset I am using here is the one from paper [1], which, according to that paper, was obtained from your paper [2]. Is the poor result caused by my dataset being too long (its average length is twice that of the DeepCom dataset), or is there some other problem? I would be extremely grateful for another reply.

[1] Jian Zhang, Xu Wang, Hongyu Zhang, et al. Retrieval-based Neural Source Code Summarization.
[2] Xing Hu, Ge Li, Xin Xia, David Lo, Shuai Lu, and Zhi Jin. 2018. Summarizing Source Code with Transferred API Knowledge.
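As a sanity check, the printed score can be reproduced from its own components: BLEU is the brevity penalty times the geometric mean of the four n-gram precisions. A short sketch using the numbers above:

```python
import math

precisions = [0.208, 0.054, 0.030, 0.022]  # 1- to 4-gram precisions
hyp_len, ref_len = 80639, 109631

# Brevity penalty: exp(1 - ref/hyp) when hypotheses are shorter overall.
bp = math.exp(1 - ref_len / hyp_len) if hyp_len < ref_len else 1.0
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))

print("BP   = %.3f" % bp)                      # 0.698, matching the output
print("BLEU = %.2f" % (100 * bp * geo_mean))   # ~3.6, matching BLEU = 3.63
```

The low BP of 0.698 shows the generated summaries are substantially shorter than the references, which compounds the already low n-gram precisions.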


satinewee commented 3 years ago

Sorry, this is the attachment to my previous email. Thank you very much.


xing-hu commented 3 years ago

Length could contribute to this problem, but it shouldn't account for a gap this large. Could you email me the dataset you are currently using so I can take a look (xinghu@zju.edu.cn)?