Closed shrango closed 1 year ago
@shrango many thanks for pointing this out, I will fix this in the next commit.
Hi, a few days ago I submitted a pull request to fix the incompatibility in the training stage. However, I have found incompatibilities in the inference stage as well: the code I submitted only guarantees that training runs normally, while inference still hits version-incompatibility problems.
The newest PR (8c2b6eef192da319328cdce346e9f802b4272269) seems to run, yet I have no idea what the last parameter (True) that is passed in means, in transformer/LD/Decoder.py at lines 173 and 207:

out, _state = net(inputu, inputhu, (None, None,), src_pad_mask, chk_pad_mask, None, out, True)

With this commit the inference stage now runs successfully, but it contains this seemingly redundant parameter, and I do not know whether it has any concrete meaning.
As you can see, the implementation of Decoderlayer.forward at line 36 originally takes only 8 parameters, yet 9 are passed in when the forward function is called, the extra one being the trailing True. To fix it, I simply added an unknown_meaning parameter to the function definition, which lets the script run smoothly.
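A minimal sketch of the mismatch described above and of the workaround. The parameter names after self here are placeholders, not the real signature in transformer/LD/Decoder.py; only the shape of the problem (a call site passing one more positional argument than the definition accepts) and the fix (adding a trailing parameter with a default) are illustrated.

```python
class DecoderLayerBefore:
    # Original-style definition: 7 parameters after self, so an
    # 8-argument call raises TypeError.
    def forward(self, inpute, inputh, inputo, src_pad_mask,
                chk_pad_mask, tgt_pad_mask, query_unit):
        return "ok"


class DecoderLayerAfter:
    # Patched definition: the extra trailing parameter gets a default,
    # so both the old 7-argument and the new 8-argument call sites work.
    def forward(self, inpute, inputh, inputo, src_pad_mask,
                chk_pad_mask, tgt_pad_mask, query_unit,
                unknown_meaning=None):
        return "ok"


# Mirror of the call shape in the issue: 8 positional arguments.
args = ("e", "h", (None, None), "spm", "cpm", None, "q")

try:
    DecoderLayerBefore().forward(*args, True)
    raised = False
except TypeError:
    raised = True
print(raised)  # the original definition rejects the extra argument

print(DecoderLayerAfter().forward(*args, True))  # the patched one accepts it
```

Giving the extra parameter a default value keeps older call sites working, but as the issue notes, silently ignoring the flag may drop whatever behavior the caller intended it to switch on.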
However, the result turns out to be not ideal: it yields a SacreBLEU of 25.4623 on the WMT14 EN-DE task (using the 'LD' setting but without attentive phrase attention). So I am wondering whether something important has been overlooked, possibly related to that unknown_meaning parameter.
Hi, I came across some errors when reproducing the 'LD' (Learning Source Phrase Representation for NMT) approach, likely introduced when the codebase was extended to support multiple methods. I examined and fixed the parameter passing in these functions and can now run the LD training code successfully.