Problem: no EOS (end-of-sentence) tokens are being predicted by the decoder.
Trying tf.contrib.seq2seq.GreedyEmbeddingHelper for inference, as in the sketch below.
UPDATE: Doesn't help.
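For reference, a minimal TF 1.x sketch of this kind of greedy-decoding setup. The sizes, token ids, cell, and encoder state below are illustrative stand-ins, not code from this project:

```python
import tensorflow as tf

# Illustrative sizes and ids; everything here is an assumption for the sketch.
vocab_size, embed_dim, hidden_dim = 1000, 64, 128
batch_size, max_decode_len = 4, 20
GO_ID, EOS_ID = 1, 2

embedding_matrix = tf.get_variable("embedding", [vocab_size, embed_dim])
decoder_cell = tf.nn.rnn_cell.GRUCell(hidden_dim)
encoder_final_state = tf.zeros([batch_size, hidden_dim])  # stand-in encoder state

# GreedyEmbeddingHelper feeds the argmax token's embedding back in at each
# step and marks a sequence finished once it emits end_token.
helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
    embedding=embedding_matrix,
    start_tokens=tf.fill([batch_size], GO_ID),
    end_token=EOS_ID)

decoder = tf.contrib.seq2seq.BasicDecoder(
    cell=decoder_cell,
    helper=helper,
    initial_state=encoder_final_state,
    output_layer=tf.layers.Dense(vocab_size))

# maximum_iterations caps generation for sequences that never emit EOS.
outputs, _, lengths = tf.contrib.seq2seq.dynamic_decode(
    decoder, maximum_iterations=max_decode_len)
predicted_ids = outputs.sample_id  # [batch, decoded_time]
```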
Trying TensorFlow NMT's strategy of padding shorter sequences with EOS instead of PAD (see the padding sketch below).
UPDATE: Doesn't help.
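For concreteness, a small sketch of what EOS-padding looks like on the data side; the token ids are illustrative assumptions:

```python
EOS_ID = 2  # assumed id

def pad_batch(sequences, pad_id=EOS_ID):
    """Pad each token-id sequence up to the batch max length with pad_id.

    Padding with EOS_ID (the TensorFlow NMT convention) instead of a
    dedicated PAD id means the model keeps seeing EOS after a sentence
    ends, rather than a symbol it is never asked to predict.
    """
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]

# Example: two sequences padded to length 4 with EOS.
print(pad_batch([[5, 6, 7, EOS_ID], [8, 9, EOS_ID]]))
# -> [[5, 6, 7, 2], [8, 9, 2, 2]]
```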
Testing with larger decoder lengths seems like a good strategy. However, the sequence-loss computation doesn't work when early-training output finishes before the input sequence does, since the logits and ground truth then have mismatched time dimensions. Fix to try: clip the ground truth to the max time steps actually produced in a particular minibatch (sketch below).
UPDATE: Doesn't work for inference.
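A sketch of that clipping fix, assuming the logits come from dynamic_decode and that targets/target_lengths hold the padded ground truth (all names and shapes are illustrative):

```python
import tensorflow as tf

def clipped_sequence_loss(logits, targets, target_lengths):
    """Sequence loss that tolerates the decoder stopping before the targets end.

    logits:         [batch, decoded_time, vocab] from dynamic_decode
    targets:        [batch, max_target_time] padded ground-truth ids
    target_lengths: [batch] true target lengths
    """
    decoded_time = tf.shape(logits)[1]
    # Clip the ground truth (and the loss mask) to the number of steps the
    # decoder actually produced for this minibatch; this covers the case
    # where early-training output finishes before the targets do.
    clipped_targets = targets[:, :decoded_time]
    weights = tf.sequence_mask(
        tf.minimum(target_lengths, decoded_time),
        maxlen=decoded_time, dtype=tf.float32)
    return tf.contrib.seq2seq.sequence_loss(
        logits=logits, targets=clipped_targets, weights=weights)
```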
Incrementing the sequence lengths by 1 seems to train on the EOS tokens as well, which lets GreedyEmbeddingHelper predict variable-length sentences. Works for beam search too (sketch below).
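A sketch of the length+1 fix end to end, under the same illustrative names as the first sketch. The key point is that both teacher forcing and the loss mask now extend one step past the sentence, so the EOS position is actually trained:

```python
import tensorflow as tf

# Illustrative sizes, ids, and stand-in tensors; none of this is the
# project's actual code.
vocab_size, embed_dim, hidden_dim = 1000, 64, 128
batch_size, max_target_time, beam_width = 4, 20, 5
GO_ID, EOS_ID = 1, 2

embedding_matrix = tf.get_variable("embedding", [vocab_size, embed_dim])
decoder_cell = tf.nn.rnn_cell.GRUCell(hidden_dim)
output_layer = tf.layers.Dense(vocab_size)
encoder_final_state = tf.zeros([batch_size, hidden_dim])

target_ids = tf.zeros([batch_size, max_target_time], dtype=tf.int32)
target_lengths = tf.fill([batch_size], 10)  # real tokens, excluding EOS

# The fix: +1 so teacher forcing and the loss both cover the EOS position.
lengths_with_eos = target_lengths + 1

train_helper = tf.contrib.seq2seq.TrainingHelper(
    inputs=tf.nn.embedding_lookup(embedding_matrix, target_ids),
    sequence_length=lengths_with_eos)
train_decoder = tf.contrib.seq2seq.BasicDecoder(
    decoder_cell, train_helper, encoder_final_state, output_layer=output_layer)
train_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(train_decoder)

logits = train_outputs.rnn_output
decoded_time = tf.shape(logits)[1]
weights = tf.sequence_mask(lengths_with_eos, maxlen=decoded_time,
                           dtype=tf.float32)
loss = tf.contrib.seq2seq.sequence_loss(
    logits=logits, targets=target_ids[:, :decoded_time], weights=weights)

# Once EOS is learned, beam-search inference terminates on it as well:
beam_decoder = tf.contrib.seq2seq.BeamSearchDecoder(
    cell=decoder_cell,
    embedding=embedding_matrix,
    start_tokens=tf.fill([batch_size], GO_ID),
    end_token=EOS_ID,
    initial_state=tf.contrib.seq2seq.tile_batch(encoder_final_state,
                                                beam_width),
    beam_width=beam_width,
    output_layer=output_layer)
beam_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
    beam_decoder, maximum_iterations=max_target_time)
```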