Failed to restore with saved model instead of checkpoint

begeekmyfriend commented 5 years ago

Since a SavedModel can be loaded through the C API, I tried switching from a checkpoint to a SavedModel, but restoring it fails with the following exceptions:

Traceback (most recent call last):
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1317, in _run_fn
    self._extend_graph()
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1352, in _extend_graph
    tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 1 of node model/inference/decoder/while/Merge_1_1 was passed int32 from model/inference/decoder/while/NextIteration_1:0 incompatible with expected float.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1276, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 1 of node model/inference/decoder/while/Merge_1_1 was passed int32 from model/inference/decoder/while/NextIteration_1:0 incompatible with expected float.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "synthesize.py", line 98, in <module>
    main()
  File "synthesize.py", line 88, in main
    _ = tacotron_synthesize(args, hparams, taco_checkpoint, sentences)
  File "/home/leoma/Tacotron-2-wangdantong-22050/tacotron/synthesize.py", line 127, in tacotron_synthesize
    return run_eval(args, checkpoint_path, output_dir, hparams, sentences)
  File "/home/leoma/Tacotron-2-wangdantong-22050/tacotron/synthesize.py", line 57, in run_eval
    synth.load(checkpoint_path, hparams)
  File "/home/leoma/Tacotron-2-wangdantong-22050/tacotron/synthesizer.py", line 59, in load
    tf.saved_model.loader.load(self.session, [tag_constants.TRAINING], checkpoint_path)
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/saved_model/loader_impl.py", line 269, in load
    return loader.load(sess, tags, import_scope, **saver_kwargs)
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/saved_model/loader_impl.py", line 421, in load
    self.restore_variables(sess, saver, import_scope)
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/saved_model/loader_impl.py", line 375, in restore_variables
    saver.restore(sess, self._variables_path)
  File "/home/leoma/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1312, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Input 1 of node model/inference/decoder/while/Merge_1_1 was passed int32 from model/inference/decoder/while/NextIteration_1:0 incompatible with expected float.

Here is my code in tacotron/train.py:

builder = tf.saved_model.builder.SavedModelBuilder(save_dir)
builder.add_meta_graph_and_variables(sess, [tag_constants.TRAINING], strip_default_attrs=True)
...
builder.save()

And in tacotron/synthesizer.py:

tf.saved_model.loader.load(self.session, [tag_constants.TRAINING], save_dir)

begeekmyfriend commented 5 years ago

Solved by substituting the following for the loader call:

saver = tf.train.Saver()
saver.restore(self.session, os.path.join('logs-Tacotron', 'taco_pretrained', 'variables', 'variables'))

It seems there is a bug in the load_graph step of tf.saved_model.loader.load.
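For anyone reproducing this, here is a minimal sketch of where that restore path comes from. It assumes the standard SavedModel layout, in which the weights are stored as an ordinary checkpoint with the prefix variables/variables inside the export directory. The helper names are hypothetical and not part of this repository or of TensorFlow.

```python
import os


def savedmodel_variables_prefix(export_dir):
    """Return the checkpoint prefix stored inside a SavedModel directory.

    Hypothetical helper: a SavedModel keeps its weights as a regular
    checkpoint named "variables" inside a "variables" subdirectory.
    """
    return os.path.join(export_dir, "variables", "variables")


def has_savedmodel_variables(export_dir):
    """Check that the checkpoint index file for that prefix exists on disk."""
    return os.path.isfile(savedmodel_variables_prefix(export_dir) + ".index")


# Directory names taken from the comment above; adjust to your own export path.
export_dir = os.path.join("logs-Tacotron", "taco_pretrained")
prefix = savedmodel_variables_prefix(export_dir)
```

The resulting prefix can then be passed to saver.restore(self.session, prefix) once the model graph has been built in code, so only the variables are read from the SavedModel and the stored graph is not imported.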

xiaoyangnihao commented 5 years ago

Hi @begeekmyfriend, how did you fix this problem? Could you share some more details? Thanks.

begeekmyfriend commented 5 years ago

I worked around it rather than solving it directly, so this issue is no longer significant for me.

Rayhane-mamah / Tacotron-2

Failed to restore with saved model instead of checkpoint #343