HasnainRaz / FC-DenseNet-TensorFlow

Fully Convolutional DenseNet (A.K.A 100 layer tiramisu) for semantic segmentation of images implemented in TensorFlow.
MIT License

Inference fails due to model shape mismatch #13

Closed shouyinz closed 5 years ago

shouyinz commented 5 years ago

I'm trying to run train and infer, but I get the following error during infer:

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [1,1,48,2] rhs shape= [1,1,80,2]
     [[Node: save/Assign_87 = Assign[T=DT_FLOAT, _class=["loc:@prediction/last_conv1x1/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](prediction/last_conv1x1/kernel, save/RestoreV2_87)]]

Could you provide some advice? Thanks.

HasnainRaz commented 5 years ago

Can you share the complete error?

shouyinz commented 5 years ago
('First Convolution Out: ', TensorShape([Dimension(None), Dimension(256), Dimension(256), Dimension(48)]))
('Downsample Out:', TensorShape([Dimension(None), Dimension(128), Dimension(128), Dimension(80)]))
('Downsample Out:', TensorShape([Dimension(None), Dimension(64), Dimension(64), Dimension(128)]))
('Bottleneck Block: ', TensorShape([Dimension(None), Dimension(64), Dimension(64), Dimension(48)]))
('Upsample after concat: ', TensorShape([Dimension(None), Dimension(128), Dimension(128), Dimension(176)]))
('Upsample after concat: ', TensorShape([Dimension(None), Dimension(256), Dimension(256), Dimension(128)]))
('Mask Prediction: ', TensorShape([Dimension(None), Dimension(256), Dimension(256), Dimension(2)]))
2018-11-16 10:14:58.673542: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "main.py", line 44, in <module>
    main()
  File "main.py", line 40, in main
    tiramisu.infer(FLAGS.infer_data, FLAGS.batch_size, FLAGS.ckpt, FLAGS.output_folder)
  File "/Users/justin.tsai/Projects/FC-DenseNet-TensorFlow/model.py", line 378, in infer
    saver.restore(sess, ckpt.model_checkpoint_path)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1582, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [1,1,48,2] rhs shape= [1,1,80,2]
     [[node save/Assign_87 (defined at /Users/justin.tsai/Projects/FC-DenseNet-TensorFlow/model.py:375)  = Assign[T=DT_FLOAT, _class=["loc:@prediction/last_conv1x1/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](prediction/last_conv1x1/kernel, save/RestoreV2:87)]]

Caused by op u'save/Assign_87', defined at:
  File "main.py", line 44, in <module>
    main()
  File "main.py", line 40, in main
    tiramisu.infer(FLAGS.infer_data, FLAGS.batch_size, FLAGS.ckpt, FLAGS.output_folder)
  File "/Users/justin.tsai/Projects/FC-DenseNet-TensorFlow/model.py", line 375, in infer
    saver = tf.train.Saver()
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1102, in __init__
    self.build()
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1114, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1151, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 795, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 428, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 119, in restore
    self.op.get_shape().is_fully_defined())
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 221, in assign
    validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 61, in assign
    use_locking=use_locking, name=name)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [1,1,48,2] rhs shape= [1,1,80,2]
     [[node save/Assign_87 (defined at /Users/justin.tsai/Projects/FC-DenseNet-TensorFlow/model.py:375)  = Assign[T=DT_FLOAT, _class=["loc:@prediction/last_conv1x1/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](prediction/last_conv1x1/kernel, save/RestoreV2:87)]]

Python 2.7.15, TensorFlow 1.12.0

HasnainRaz commented 5 years ago

It seems that the graph definition has changed since training. Did you use the same parameters (growth_k, layers_per_block) for infer as you did for training? The graph definition must be identical at train and infer time. I cannot reproduce this. Please try retraining and then running infer with the same parameters (growth_k, layers_per_block, etc.).
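To narrow down which variables differ, it can help to compare the shapes in the current graph with the shapes stored in the checkpoint. In the Assign error, lhs is the variable in the current graph ([1,1,48,2]) and rhs is the value read from the checkpoint ([1,1,80,2]). The following is only a rough sketch, not code from this repo: `find_shape_mismatches` is a hypothetical helper, and the commented TensorFlow 1.x lines assume `ckpt_dir` points at your checkpoint directory or file.

```python
def find_shape_mismatches(graph_shapes, ckpt_shapes):
    """Return {name: (graph_shape, ckpt_shape)} for variables that exist
    in both mappings but have different shapes."""
    mismatches = {}
    for name, graph_shape in graph_shapes.items():
        ckpt_shape = ckpt_shapes.get(name)
        # Variables missing from the checkpoint are skipped here;
        # only genuine shape disagreements are reported.
        if ckpt_shape is not None and list(ckpt_shape) != list(graph_shape):
            mismatches[name] = (list(graph_shape), list(ckpt_shape))
    return mismatches


# With TensorFlow 1.x the two dicts could be filled roughly like this,
# after building the model and before calling saver.restore
# (ckpt_dir is an assumed name for the checkpoint location):
#   graph_shapes = {v.op.name: v.shape.as_list() for v in tf.global_variables()}
#   ckpt_shapes = dict(tf.train.list_variables(ckpt_dir))

# Shapes taken from the error message in this issue.
graph_shapes = {"prediction/last_conv1x1/kernel": [1, 1, 48, 2]}
ckpt_shapes = {"prediction/last_conv1x1/kernel": [1, 1, 80, 2]}

mismatches = find_shape_mismatches(graph_shapes, ckpt_shapes)
for name, (in_graph, in_ckpt) in sorted(mismatches.items()):
    print("%s: graph %s vs checkpoint %s" % (name, in_graph, in_ckpt))
```

If many variables show up in the report, the model was most likely built with different growth_k or layers_per_block values than the ones used when the checkpoint was written.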

Closing until confirmed, since I can't reproduce.