Hi, thank you for your interest in my work. You can find two repositories on GitHub that contain the ReCoSa model: the first is the authors', written in TensorFlow, and the other is mine. Before re-implementing my own version of ReCoSa, I tried to run the original version, but it ran into the same issue as yours. After carefully analyzing the structure of ReCoSa, I think the problem lies in the Transformer decoder. After I replaced the masked Transformer decoder with a standard RNN (GRU) decoder, the performance seemed fine. I recommend trying to change the structure of ReCoSa accordingly.
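For reference, here is a minimal sketch (in PyTorch) of the kind of decoder swap I mean. The class name `GRUDecoder`, tensor names such as `context_states`, and all dimensions are illustrative assumptions rather than code from either repository: the masked Transformer decoder is replaced by a single-layer GRU that attends over ReCoSa's utterance-level context representations at each decoding step.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GRUDecoder(nn.Module):
    """Hypothetical replacement for the masked Transformer decoder:
    a GRU that attends over the utterance-level context states."""

    def __init__(self, vocab_size, embed_size=256, hidden_size=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.gru = nn.GRU(embed_size + hidden_size, hidden_size, batch_first=True)
        self.attn = nn.Linear(hidden_size * 2, 1)   # additive attention scorer
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, prev_token, hidden, context_states):
        # prev_token: [batch], hidden: [1, batch, hidden],
        # context_states: [batch, turns, hidden] (ReCoSa context encoder output)
        emb = self.embed(prev_token).unsqueeze(1)                   # [batch, 1, embed]

        # attend the current decoder state over the context turns
        query = hidden[-1].unsqueeze(1).expand(-1, context_states.size(1), -1)
        scores = self.attn(torch.cat([query, context_states], dim=-1)).squeeze(-1)
        weights = F.softmax(scores, dim=-1)                         # [batch, turns]
        context = torch.bmm(weights.unsqueeze(1), context_states)   # [batch, 1, hidden]

        # one GRU step over [embedding; attended context]
        output, hidden = self.gru(torch.cat([emb, context], dim=-1), hidden)
        return self.out(output.squeeze(1)), hidden                  # logits, new hidden
```

The word-level encoder and the self-attention context encoder from ReCoSa stay unchanged; only the response decoder is swapped, and generation then runs token by token with the usual greedy or sampled decoding.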
If you have any issues, feel free to contact me.
On 10/28/2020 14:18, Jaewoo Song (notifications@github.com) wrote:
Hi, thank you for publishing this great repository. I opened this issue because I want to ask you a question.
I'm currently trying to implement a multi-turn dialogue generation task using the ReCoSa structure. I coded the entire structure by referring to the paper, and I combined 4 datasets: DailyDialog, PersonaChat, EmpatheticDialogues and BlendedSkillTalk. But after training, I have not been able to get satisfactory results, since the model produces meaningless or severely repetitive words at the inference step. I've been trying to improve it by changing the hyperparameter settings several times and training again and again, but I still can't get good results.
I want to know why mine is not working... maybe due to wrong hyperparameter settings, problems with the implementation itself, or lack of data. The most likely culprit to me at this moment is the sequence length, since I've set it very long at 300, so I suspect there are difficulties in encoding an utterance, but I'm not sure.
So I wonder if you got decent quality with the ReCoSa model, not just good automatic scores but actually engaging conversations (even if your structure seems slightly different from the original version in the paper). Looking at your code, my model has bigger dimensions and more complexity, but I don't think this is the cause, since not even overfitting is happening.
Please tell me what your results looked like. I would be really grateful. Thank you.
Oh, that is not what I had thought about. I will look into that!
Thank you very much.