jiangxiluning / MASTER-TF

MASTER
MIT License

Training converges quickly (acc > 0.95), but the validation result is very poor (acc < 0.3) #15

Open UESTC-Liuxin opened 3 years ago

UESTC-Liuxin commented 3 years ago

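Hello, I ran experiments on my own dataset (Chinese and English, real-world scenes, 160k samples) and found that training converges very quickly, but the validation results are very poor. Have you ever run into this? My guess is that, with this architecture and this way of feeding the input, it is equivalent to setting teaching_forcing = 1, which makes it very easy to overfit.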
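To make the hypothesis above concrete, here is a minimal sketch of how accuracy measured with teacher forcing (each step sees the ground-truth prefix) can differ from accuracy measured with free-running decoding (each step sees the model's own earlier outputs). Everything in it is an assumption for illustration: `toy_step` is a hypothetical stand-in for a decoder step, `SOS` is a made-up start token id, and none of it is taken from MASTER-TF.

```python
from typing import Callable, List, Sequence

SOS = 0  # hypothetical start-of-sequence id


def teacher_forced_accuracy(step: Callable[[List[int]], int],
                            target: Sequence[int]) -> float:
    """Predict each token from the ground-truth prefix (training-style)."""
    correct = 0
    for t in range(len(target)):
        prefix = [SOS] + list(target[:t])
        correct += int(step(prefix) == target[t])
    return correct / len(target)


def free_running_accuracy(step: Callable[[List[int]], int],
                          target: Sequence[int]) -> float:
    """Predict each token from the model's own previous outputs (inference-style)."""
    prefix = [SOS]
    correct = 0
    for t in range(len(target)):
        pred = step(prefix)
        correct += int(pred == target[t])
        prefix.append(pred)  # the next step conditions on the prediction
    return correct / len(target)


def toy_step(prefix: List[int]) -> int:
    """Hypothetical decoder step: last token + 1, with one mistake after token 2."""
    last = prefix[-1]
    return last + 2 if last == 2 else last + 1


target = [1, 2, 3, 4, 5, 6]
tf_acc = teacher_forced_accuracy(toy_step, target)
fr_acc = free_running_accuracy(toy_step, target)
print(tf_acc, fr_acc)
```

In this toy case a single wrong step costs one position under teacher forcing, but every later position under free-running decoding, because the error is fed back into the prefix. Whether that effect explains the gap reported here is not established by this thread.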

charlesmindee commented 3 years ago

Hi, I have the same issue

jiangxiluning commented 3 years ago

@UESTC-Liuxin Hi, I haven't run into this particular problem. Could it be related to the length of your text?

charlesmindee commented 3 years ago

Actually, I had made a mistake in the loss. I fixed it by shifting the ground-truth sequences to the right! (I changed the loss function a bit in my own implementation, and the model now works well when predicting.)
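The comment above does not show the actual code, so the following is only a sketch of one common form of this mistake, under assumed names. With teacher forcing, the decoder input is usually the ground truth shifted right by one position (a start token is prepended), and the loss compares each output position with the next ground-truth token. If the loss target is accidentally the same sequence as the decoder input, a model can score perfectly during training just by copying its input, and then fail at prediction time. `SOS`, `EOS`, `make_decoder_pair`, and `copy_model` are hypothetical and not part of MASTER-TF.

```python
from typing import List, Sequence, Tuple

SOS, EOS = 1, 2  # hypothetical special-token ids


def make_decoder_pair(tokens: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Build the teacher-forcing input and the loss target for one sequence."""
    decoder_input = [SOS] + list(tokens)  # ground truth shifted right by one
    target = list(tokens) + [EOS]         # what each position should predict
    return decoder_input, target


def copy_model(decoder_input: Sequence[int]) -> List[int]:
    """Degenerate 'model' that just echoes its input at every position."""
    return list(decoder_input)


def accuracy(pred: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of positions where prediction and target agree."""
    return sum(p == t for p, t in zip(pred, target)) / len(target)


tokens = [7, 8, 9]
decoder_input, target = make_decoder_pair(tokens)
pred = copy_model(decoder_input)

# Misaligned loss: the target is the same sequence the decoder was fed.
buggy_acc = accuracy(pred, decoder_input)
# Aligned loss: the target is one step ahead of the decoder input.
fixed_acc = accuracy(pred, target)
print(buggy_acc, fixed_acc)
```

The copying model looks perfect under the misaligned target and gets nothing right under the aligned one, which matches the symptom of high training accuracy with poor prediction results.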

jiangxiluning commented 3 years ago

@charlesmindee great!