Open UESTC-Liuxin opened 3 years ago
Hi, I have the same issue
@UESTC-Liuxin Hello, I haven't run into this specific problem myself. Could it be related to the length of your text?
Actually, I had made a mistake in the loss. I fixed it by shifting the ground-truth sequences to the right! (I changed the loss function a bit in my implementation, and the model now works well when predicting.)
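For anyone hitting the same thing: the thread does not show the actual fix, so the following is only a minimal sketch of what "shifting the ground-truth sequences to the right" usually means with teacher forcing. The helper name `make_decoder_pairs` and the token ids (`bos_id=1`, `eos_id=2`) are made up for illustration and are not from this repository.

```python
from typing import List, Tuple

def make_decoder_pairs(target: List[int], bos_id: int, eos_id: int) -> Tuple[List[int], List[int]]:
    """Build (decoder_input, labels) for one target sequence.

    Hypothetical helper: the decoder input is the target shifted right by
    one position (a start token is prepended), so at step t the model sees
    tokens 0..t-1 and is trained to predict token t.
    """
    decoder_input = [bos_id] + target   # shifted right by one
    labels = target + [eos_id]          # what the loss compares against
    return decoder_input, labels

dec_in, labels = make_decoder_pairs([5, 6, 7], bos_id=1, eos_id=2)
print(dec_in)  # [1, 5, 6, 7]
print(labels)  # [5, 6, 7, 2]
```

If the loss is computed against the unshifted sequence instead, the model can simply learn to copy its current input token. That gives a very low training loss but poor output at prediction time, which would be consistent with the symptom described above.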
@charlesmindee great!
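Hello, I ran experiments on my own dataset (Chinese-English, about 160k samples from a real-world scenario) and found that training converges very quickly, but the validation results are very poor. Have you run into this? My guess is that this architecture and input scheme is effectively equivalent to setting teaching_forcing = 1, which makes it very easy to overfit.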
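To illustrate the gap being described here, below is a toy sketch (not this repository's model) comparing teacher-forced evaluation with free-running decoding. The lookup-table "model", the token ids, and all function names are assumptions made up for the example. It only shows how one early mistake can compound once the model is fed its own predictions, and it does not by itself tell you whether a particular run is overfitting.

```python
from typing import Callable, List

def teacher_forced(step: Callable[[int], int], target: List[int], bos_id: int) -> List[int]:
    """Predict each token from the *gold* previous token (teacher forcing)."""
    prev_tokens = [bos_id] + target[:-1]
    return [step(p) for p in prev_tokens]

def free_running(step: Callable[[int], int], length: int, bos_id: int) -> List[int]:
    """Predict each token from the model's *own* previous prediction."""
    out, prev = [], bos_id
    for _ in range(length):
        prev = step(prev)
        out.append(prev)
    return out

def accuracy(pred: List[int], target: List[int]) -> float:
    return sum(p == t for p, t in zip(pred, target)) / len(target)

# Hypothetical toy "model": maps previous token -> next token.
# It is wrong only on the very first step (0 -> 9 instead of 0 -> 1).
table = {0: 9, 1: 2, 2: 3, 3: 4, 9: 9}

def step(prev: int) -> int:
    return table.get(prev, 9)

target = [1, 2, 3, 4]
tf_acc = accuracy(teacher_forced(step, target, bos_id=0), target)
fr_acc = accuracy(free_running(step, len(target), bos_id=0), target)
print(tf_acc)  # 0.75
print(fr_acc)  # 0.0
```

A practical check along these lines is to compute the validation metric both ways: if the teacher-forced score is also poor, the problem is more likely generalization or a data/loss bug than the decoding mismatch.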