cmusphinx / g2p-seq2seq

G2P with Tensorflow

Implement and test adaptive training algorithms #29

Closed. gorinars closed this issue 2 years ago.

gorinars commented 8 years ago

SGD seems to converge slowly. Can we have an option for RMSProp, Adadelta, and Adagrad? This should be easy to implement with the respective TensorFlow optimizers.
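For reference, a minimal sketch of what such an option could look like with the TF 1.x `tf.train` optimizers; the `build_optimizer` helper and the option names are only illustrative, not existing g2p-seq2seq code:

```python
import tensorflow as tf

def build_optimizer(name, learning_rate):
    """Hypothetical helper: map an option name to a tf.train optimizer."""
    optimizers = {
        "sgd": tf.train.GradientDescentOptimizer,
        "rmsprop": tf.train.RMSPropOptimizer,
        "adadelta": tf.train.AdadeltaOptimizer,
        "adagrad": tf.train.AdagradOptimizer,
        "adam": tf.train.AdamOptimizer,
    }
    return optimizers[name](learning_rate)

# Example usage: opt = build_optimizer("rmsprop", 0.001)
```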

bmilde commented 8 years ago

I have already tested ADAM on this, and convergence is much faster. But the change needs to be made in TensorFlow's seq2seq model. As you said, it's simple: I copied seq2seq_model.py from TensorFlow and changed the line that creates opt to:

opt = tf.train.AdamOptimizer(self.learning_rate)

The learning rate needs to be changed for ADAM; a good default is 0.001. Also, with ADAM you don't need the learning rate decay, so you can remove it altogether or just set the decay factor to 1.0.
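For context, the surrounding code in the copied seq2seq_model.py would then look roughly like this. This is only a sketch following the variable names of the TensorFlow translate tutorial; details of the actual file may differ:

```python
# Sketch of the training branch in the copied seq2seq_model.py,
# with the optimizer swapped to Adam (names follow the TensorFlow
# translate tutorial; details may differ).
if not forward_only:
    self.gradient_norms = []
    self.updates = []
    # The substantive change: Adam instead of plain gradient descent.
    # With Adam, use a learning rate around 0.001 and set the decay
    # factor to 1.0 (or drop the decay op entirely).
    opt = tf.train.AdamOptimizer(self.learning_rate)
    for b in xrange(len(buckets)):
        gradients = tf.gradients(self.losses[b], params)
        clipped_gradients, norm = tf.clip_by_global_norm(
            gradients, max_gradient_norm)
        self.gradient_norms.append(norm)
        self.updates.append(opt.apply_gradients(
            zip(clipped_gradients, params), global_step=self.global_step))
```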

Then you simply change the import in g2p.py from:

from tensorflow.models.rnn.translate import seq2seq_model

to:

from g2p_seq2seq import seq2seq_model

gorinars commented 8 years ago

Folks say RMSProp might be even better for recurrent networks. I did not compare them on this particular toolkit. By the way, a pull request is always welcome ;)