@p-baleine, thanks a lot for your comment! Apparently something went wrong when I was stripping the excess data from the Cornell Movie-Dialogs Corpus. I'll try training with the normal order now. By the way, how long did it take you to get results similar to macournoyer/neuralconvo's?
I've trained macournoyer/neuralconvo's model for 20 epochs with 50,000 examples. In the TensorFlow implementation, a batch of utterances (the default batch size is 64) is randomly sampled for training at each step. So I've run 16,000 steps (> 20 × 50,000 / 64), which took about 3 hours on a GTX 1080.
Sorry, but I don't remember the numbers exactly; if the above numbers are wrong, I'll tell you tomorrow.
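A minimal sketch of the step-count arithmetic above, assuming random batch sampling (so "epochs" are approximate passes over the data; the variable names are illustrative):

```python
# Step-count arithmetic for the run described above: with randomly
# sampled batches, an "epoch" is just num_examples / batch_size steps.
num_examples = 50000  # training pairs
num_epochs = 20       # desired passes over the data
batch_size = 64       # translate.py's default

total_steps = num_epochs * num_examples / batch_size  # 15625.0
print(f"~{total_steps:.0f} steps needed; 16000 steps covers it")
```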
Gotcha, thank you! I'll close the issue for now, but feel free to keep commenting.
Sorry for my late reply. The details of the run where I got a good result are below.
I've trained macournoyer/neuralconvo's sample with one layer of size 640 (the layer size is smaller because the GPU, a GTX 660 Ti, did not have enough memory). Then I've trained TensorFlow's sample with one layer of size 640 for 15,000 steps, i.e. I executed the following command:
$ python translate.py --size=640 --num_layers=1 --train_dir=$(pwd)/model_layer_1_size_640_gru
It took about 3 hours; perplexity on bucket 0 was around 8, and on bucket 1 it was around 15. I got the best output with this setting.
I also tried 2 layers of size 1024, but I haven't gotten good outputs yet.
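For reference: translate.py's per-bucket perplexity is the exponential of the per-token cross-entropy loss, so a perplexity of 8 corresponds to a loss of about ln 8 ≈ 2.08. A minimal sketch of that relationship (the overflow guard mirrors the tutorial code, as I recall):

```python
import math

def perplexity(loss):
    # Perplexity = exp(per-token cross-entropy loss); guard against
    # overflow for very large losses, as the tutorial code does.
    return math.exp(loss) if loss < 300 else float("inf")

print(perplexity(math.log(8)))   # ~8.0, like bucket 0 above
print(perplexity(math.log(15)))  # ~15.0
```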
@p-baleine What is your vocabulary size? And what are the sizes of your buckets?
Thank you for your useful seq2seq chatbot repository.
I'm wondering why the utterances in your training data are in reverse order. For example, in your data:
But the actual utterance starts with “Can we make this quick?...” Is there any reason for this?
I've tried training TensorFlow's translation model on the [Cornell Movie-Dialogs Corpus](https://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html) in normal order and got outputs similar to macournoyer/neuralconvo's.
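For anyone wondering about the reversal: flipping the source sequence is the trick from Sutskever et al. (2014), “Sequence to Sequence Learning with Neural Networks”; it puts the first input words closest to the first output words, which tends to ease optimization. A minimal sketch of what such preprocessing does (the function name is illustrative, not from this repo):

```python
def reverse_encoder_input(tokens):
    """Reverse source tokens before feeding them to the encoder
    (Sutskever et al., 2014): the start of the input then sits next
    to the start of the output, shortening early long-range dependencies."""
    return list(reversed(tokens))

print(reverse_encoder_input(["Can", "we", "make", "this", "quick", "?"]))
# ['?', 'quick', 'this', 'make', 'we', 'Can']
```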