Open doubler opened 6 years ago
What do you mean by non-stable output?
In inference (decoding), dropout is always turned off (i.e. set to 0).
Setting dropout to 1.0 makes no sense: it would drop all activations during training. If you want to turn off dropout during training, use `--hparams="layer_prepostprocess_dropout=0,attention_dropout=0,relu_dropout=0"`, depending on which types of dropout you want to turn off. However, this is not recommended unless you have really huge training data or some other technique to prevent overfitting.
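To make the rate semantics concrete, here is a minimal pure-Python sketch of inverted dropout (the convention most frameworks, including the hparams above, follow: the value is the *drop* rate, so 0 disables it and 1.0 would zero everything). The function name and seeding are illustrative, not t2t's actual code:

```python
import random

def dropout(xs, rate, training, seed=None):
    """Inverted dropout: zero each element with probability `rate`
    and scale survivors by 1/(1-rate) so the expected value is unchanged."""
    if not training or rate == 0.0:
        return list(xs)           # inference or rate 0: a no-op
    if rate >= 1.0:
        return [0.0 for _ in xs]  # rate 1.0 drops every element
    rng = random.Random(seed)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in xs]

xs = [1.0, 2.0, 3.0, 4.0]
print(dropout(xs, rate=0.5, training=False))  # identical to input
print(dropout(xs, rate=1.0, training=True))   # all zeros
```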
@martinpopel I just use the encoder as a library; I didn't use the decoder. I use the output of the encoder, and after training it still has dropout.
The dropout should be automatically turned off both in encoder and decoder during inference because all hyperparameters ending in "dropout" are automatically set to 0.0 when not in training mode.
```python
encoder = transformer.TransformerEncoder(hparams, mode=tf.estimator.ModeKeys.PREDICT)
```
You have the mode set to TRAIN, which will have dropout enabled.
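The convention described above (any hparam ending in "dropout" is forced to 0.0 outside TRAIN mode) can be sketched in a few lines. This is a toy illustration, not the real t2t code; the class and function names are made up:

```python
TRAIN, PREDICT = "train", "predict"  # stand-ins for tf.estimator.ModeKeys

class HParams:
    """Minimal hparams container for the sketch."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

def prepare_hparams(hparams, mode):
    """Zero every hyperparameter whose name ends in 'dropout'
    when the mode is not TRAIN, mirroring t2t's behavior."""
    if mode != TRAIN:
        for name, value in list(vars(hparams).items()):
            if name.endswith("dropout"):
                setattr(hparams, name, 0.0)
    return hparams

hp = HParams(attention_dropout=0.1, relu_dropout=0.1, hidden_size=512)
prepare_hparams(hp, PREDICT)
print(hp.attention_dropout, hp.relu_dropout)  # 0.0 0.0
```

So constructing the encoder with `mode=tf.estimator.ModeKeys.PREDICT` is what triggers this zeroing; with TRAIN the original dropout rates stay in effect.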
@rsepassi @martinpopel I think tensor2tensor does not provide a good way to be used as a library. After training, I just want to load the model and use it as a service. I don't want to copy the model-construction code only to change the mode parameter.
You may be interested in the SavedModel format that export.py outputs. That's the format we use to actually serve the models.
@rsepassi I simply use

```python
saver = tf.train.Saver(tf.global_variables())
saver.save(sess, checkpoint_file)
```

to save the model, and

```python
saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
saver.restore(sess, checkpoint_file)
```

to load it for inference. The encoder doesn't seem to support this approach.
I use the transformer encoder code like below to train and dump the model, and load it with

```python
saver.restore(sess, checkpoint_file)
```

But I found the model output is not stable; I think it may be because of dropout. How can I set the dropout to 1.0 in my feed_dict?
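The instability described here is exactly what active dropout produces, and it can be reproduced with a toy sketch (illustrative only; `forward` stands in for an encoder pass). With a nonzero drop rate, repeated passes over the same input disagree; with the rate at 0, which is t2t's equivalent of feeding `keep_prob = 1.0`, the output is deterministic:

```python
import random

def forward(xs, dropout_rate):
    """Toy 'encoder' pass where dropout is the only stochastic step."""
    if dropout_rate == 0.0:
        return list(xs)  # deterministic when dropout is off
    keep = 1.0 - dropout_rate
    return [x / keep if random.random() < keep else 0.0 for x in xs]

xs = [0.5, 1.5, 2.5, 3.5]
random.seed(0)
runs = {tuple(forward(xs, 0.5)) for _ in range(20)}
print(len(runs) > 1)                                # True: active dropout -> unstable output
print(forward(xs, 0.0) == forward(xs, 0.0) == xs)   # True: rate 0 is stable
```

In other words, rather than feeding a value at inference time, the fix in t2t is to build the graph in PREDICT mode so the dropout rates are already 0.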