Lldenaurois opened this issue 7 years ago (status: Open)
Hello,
How long did you train it for? I ran into this before, and I found it occurs during the early stages of training. The EOS tokens (mostly) disappeared after about 24 hours on a Titan X, though I suspect training a 'good' model requires significantly more time.
Hey man,
Thanks for the response. I trained for less than 12 hours.
I wonder if there's a tiny bug in your code where each conversation ends with a newline. That would give the model training examples in which the empty sentence is the most likely output. I can look into it!
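A rough sketch of how that could be checked, not taken from the repo: it assumes the corpus is a list of text lines and that training pairs are built from consecutive lines, which may not match the actual loader. The name `split_pairs` and the sample data are made up for illustration.

```python
def split_pairs(lines):
    """Build (source, target) pairs from consecutive lines (an assumption
    about the data layout) and separate pairs with a blank target."""
    good, empty = [], []
    for src, tgt in zip(lines, lines[1:]):
        if not src.strip():
            continue  # a blank line cannot serve as a prompt
        pair = (src.strip(), tgt.strip())
        # A blank target would teach the model to answer with _EOS right away.
        (good if tgt.strip() else empty).append(pair)
    return good, empty


# Hypothetical sample: two short conversations, each ending with a blank line.
sample = ["hi there\n", "hello\n", "\n", "how are you?\n", "fine\n", "\n"]
good, empty = split_pairs(sample)
print(len(good), len(empty))  # 2 2
```

If `empty` is non-empty on the real data, dropping those pairs before training would be one way to test whether they are behind the empty output.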
Great work on the repo though!
It's very possible there is a bug like that. If you look into it and find one, feel free to submit a PR.
Hi there,
I implemented your code in TensorFlow r1.1 and I am able to train the entire model.
When I then attempt to get a sample, I simply get a constant output [ 2, 2, 2, 2, 2 ]
This means that the trained model outputs _EOS every time.
Any ideas as to why this is happening?
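To put a number on the symptom described above, here is a minimal sketch that cuts each sampled sequence at the first _EOS and reports how many samples come out empty. The only thing taken from the thread is that id 2 is _EOS; the helper names and the second sample are hypothetical.

```python
EOS_ID = 2  # _EOS id, as in the sample output [ 2, 2, 2, 2, 2 ]


def trim_at_eos(token_ids, eos_id=EOS_ID):
    """Return the tokens that precede the first EOS."""
    out = []
    for t in token_ids:
        if t == eos_id:
            break
        out.append(t)
    return out


def empty_reply_rate(samples, eos_id=EOS_ID):
    """Fraction of samples whose first token is already EOS."""
    if not samples:
        return 0.0
    empty = sum(1 for s in samples if not trim_at_eos(s, eos_id))
    return empty / len(samples)


samples = [[2, 2, 2, 2, 2], [5, 9, 2, 2, 2]]  # second sample is made up
print(trim_at_eos(samples[0]))    # []
print(empty_reply_rate(samples))  # 0.5
```

Tracking this rate across checkpoints would show whether the empty replies fade with more training.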