Closed — gorinars closed this issue 2 years ago
I have already tested Adam on this; convergence is much faster. However, the change needs to be made in TensorFlow's seq2seq model. As you said, it's simple: I copied seq2seq_model.py from TensorFlow and changed the line that creates opt to:
opt = tf.train.AdamOptimizer(self.learning_rate)
The learning rate needs to be changed for Adam; a good default is 0.001. Also, with Adam you don't need learning-rate decay, so you can remove it altogether or just set the decay factor to 1.0.
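For reference, a minimal sketch of the pattern that line sits in, assuming a TensorFlow 1.x translate-style seq2seq_model.py; the variable names here are illustrative, not the exact ones from the file:

```python
import tensorflow as tf

# Sketch (TF 1.x API): a non-trainable learning-rate variable, the decay op,
# and the optimizer line that gets swapped from SGD to Adam.
learning_rate = 0.001             # suggested default for Adam
learning_rate_decay_factor = 1.0  # 1.0 makes the decay op a no-op

lr = tf.Variable(float(learning_rate), trainable=False)
lr_decay_op = lr.assign(lr * learning_rate_decay_factor)

# was: opt = tf.train.GradientDescentOptimizer(lr)
opt = tf.train.AdamOptimizer(lr)
```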
Then you simply change the import in g2p.py:
from g2p_seq2seq import seq2seq_model
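For example (just a sketch; seq2seq_model_adam is a hypothetical name for the modified copy saved inside the g2p_seq2seq package):

```python
# In g2p.py, point the import at the modified copy instead of the stock module.
# from g2p_seq2seq import seq2seq_model                      # original
from g2p_seq2seq import seq2seq_model_adam as seq2seq_model  # hypothetical filename
```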
Folks say RMSProp might be even better for recurrent networks; I have not compared them on this particular toolkit. By the way, a pull request is always welcome ;)
SGD seems to converge slowly. Can we have an option for RMSProp, Adadelta, and Adagrad? This should be easy to implement with the respective TensorFlow optimizers.
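A rough sketch of how such an option could be wired up, assuming a string setting that maps onto the corresponding tf.train optimizers (the helper and flag names here are hypothetical, not existing options in the toolkit):

```python
import tensorflow as tf

# Hypothetical helper: pick the optimizer by name so it can be exposed as a
# command-line option instead of hard-coding SGD in seq2seq_model.py.
def make_optimizer(name, learning_rate):
    optimizers = {
        "sgd": tf.train.GradientDescentOptimizer,
        "adam": tf.train.AdamOptimizer,
        "rmsprop": tf.train.RMSPropOptimizer,
        "adadelta": tf.train.AdadeltaOptimizer,
        "adagrad": tf.train.AdagradOptimizer,
    }
    return optimizers[name](learning_rate)

# e.g. inside the model constructor, instead of the hard-coded SGD line:
# opt = make_optimizer(FLAGS.optimizer, self.learning_rate)
opt = make_optimizer("rmsprop", 0.001)
```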