Decoding currently takes a greedy max at each decoding step. This will (severely) limit the model's performance in use, because only the probability of the next phoneme character is maximized, not the probability of the whole phoneme sequence. Beam search could find a phoneme sequence with a better overall log probability. I suggest implementing beam search by changing TensorFlow's seq2seq decoding so that decoding can be run stepwise, with the beam search itself handled in Python.
This TensorFlow issue is related: https://github.com/tensorflow/tensorflow/issues/654. Some of its comments share code for extended implementations with beam search.
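To make the proposal concrete, here is a minimal sketch of the Python side of such a beam search. It assumes a hypothetical `step_fn` that stands in for one stepwise call into the decoder and returns log probabilities for the next symbol given the prefix decoded so far. `step_fn`, `toy_step`, the `</s>` end marker, and the toy probabilities are all made up for illustration and are not part of this repository or of TensorFlow. The sketch does no length normalization.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: prefix of decoded symbols -> {next symbol: log probability}.
# In the real model this would wrap a single decoder step.
StepFn = Callable[[Sequence[str]], Dict[str, float]]
Hypothesis = Tuple[float, List[str]]  # (total log probability, symbols)


def beam_search(step_fn: StepFn, beam_width: int = 3,
                max_len: int = 20, eos: str = "</s>") -> Hypothesis:
    """Return the highest-scoring hypothesis found with the given beam width."""
    beams: List[Hypothesis] = [(0.0, [])]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next symbol.
        candidates: List[Hypothesis] = []
        for score, symbols in beams:
            for symbol, logp in step_fn(symbols).items():
                candidates.append((score + logp, symbols + [symbol]))
        # Keep only the best `beam_width` candidates.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, symbols in candidates[:beam_width]:
            if symbols[-1] == eos:
                finished.append((score, symbols))
            else:
                beams.append((score, symbols))
        if not beams:
            break
    # Hypotheses that reached max_len without emitting eos still compete.
    finished.extend(beams)
    return max(finished, key=lambda c: c[0])


def toy_step(prefix: Sequence[str]) -> Dict[str, float]:
    """Made-up distribution in which the greedy first choice is a trap."""
    if len(prefix) == 0:
        return {"A": math.log(0.6), "B": math.log(0.4)}
    if len(prefix) == 1 and prefix[0] == "A":
        return {"X": math.log(0.5), "Y": math.log(0.5)}
    if len(prefix) == 1:
        return {"Z": math.log(0.9), "X": math.log(0.1)}
    return {"</s>": 0.0}  # always end after two symbols


greedy_score, greedy_symbols = beam_search(toy_step, beam_width=1)
beam_score, beam_symbols = beam_search(toy_step, beam_width=2)
```

With `beam_width=1` the search is equivalent to the current greedy decoding: it commits to `A` (probability 0.6) and ends with a sequence probability of 0.3. With `beam_width=2` it keeps `B` alive and finds `B Z`, whose sequence probability is 0.36.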