endernewton / tf-faster-rcnn

Tensorflow Faster RCNN for Object Detection
https://arxiv.org/pdf/1702.02138.pdf
MIT License
3.65k stars, 1.57k forks

Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error: #454

Closed: henbucuoshanghai closed this issue 5 years ago

henbucuoshanghai commented 5 years ago

When I train on my own data, I get the error below. Why?

ci bus id: 0000:65:00.0, compute capability: 7.5)
Loading model check point from output/res101/voc_2007_trainval/default/res101_faster_rcnn_iter_70000.ckpt
Traceback (most recent call last):
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1278, in _do_call
    return fn(*args)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2048,16] rhs shape= [2048,84]
  [[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@resnet_v1_101/bbox_pred/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/bbox_pred/weights, save/RestoreV2/_3)]]
  [[Node: save/RestoreV2/_196 = _SendT=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_202_save/RestoreV2", _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1725, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 877, in run
    run_metadata_ptr)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1100, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1272, in _do_run
    run_metadata)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1291, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2048,16] rhs shape= [2048,84]
  [[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@resnet_v1_101/bbox_pred/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/bbox_pred/weights, save/RestoreV2/_3)]]
  [[Node: save/RestoreV2/_196 = _SendT=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_202_save/RestoreV2", _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op 'save/Assign_1', defined at:
  File "./tools/test_net.py", line 112, in <module>
    saver = tf.train.Saver()
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1281, in __init__
    self.build()
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1293, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1330, in _build
    build_save=build_save, build_restore=build_restore)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 778, in _build_internal
    restore_sequentially, reshape)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 112, in restore
    self.op.get_shape().is_fully_defined())
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 216, in assign
    validate_shape=validate_shape)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 60, in assign
    use_locking=use_locking, name=name)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
    op_def=op_def)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [2048,16] rhs shape= [2048,84]
  [[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@resnet_v1_101/bbox_pred/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/bbox_pred/weights, save/RestoreV2/_3)]]
  [[Node: save/RestoreV2/_196 = _SendT=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_202_save/RestoreV2", _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./tools/test_net.py", line 113, in <module>
    saver.restore(sess, args.model)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1759, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [2048,16] rhs shape= [2048,84]
  [[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@resnet_v1_101/bbox_pred/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/bbox_pred/weights, save/RestoreV2/_3)]]
  [[Node: save/RestoreV2/_196 = _SendT=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_202_save/RestoreV2", _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op 'save/Assign_1', defined at:
  File "./tools/test_net.py", line 112, in <module>
    saver = tf.train.Saver()
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1281, in __init__
    self.build()
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1293, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1330, in _build
    build_save=build_save, build_restore=build_restore)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 778, in _build_internal
    restore_sequentially, reshape)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 112, in restore
    self.op.get_shape().is_fully_defined())
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 216, in assign
    validate_shape=validate_shape)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 60, in assign
    use_locking=use_locking, name=name)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
    op_def=op_def)
  File "/home/li/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [2048,16] rhs shape= [2048,84]
  [[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@resnet_v1_101/bbox_pred/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/bbox_pred/weights, save/RestoreV2/_3)]]
  [[Node: save/RestoreV2/_196 = _SendT=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_202_save/RestoreV2", _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

henbucuoshanghai commented 5 years ago

Loading model check point from output/res101/voc_2007_trainval/default/res101_faster_rcnn_iter_70000.ckpt

res101_faster_rcnn_iter_70000.ckpt was trained on 20 classes, but my data has only 4 classes, so the shapes do not match and restoring fails. But I am training on my own data, so why is this checkpoint being loaded at all?
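The two shapes in the error seem consistent with that reading, assuming bbox_pred emits 4 box coordinates per class: 84 = 4 x 21 (20 classes plus background) in the checkpoint, and 16 = 4 x 4 in the current graph. Below is a minimal sketch of how one might spot such mismatches before calling restore. The helper name split_by_shape and the conv1 entry are made up for illustration; only the bbox_pred shapes come from the log above.

```python
def split_by_shape(graph_shapes, ckpt_shapes):
    """Partition graph variables into restorable names and shape mismatches.

    Both arguments map variable name -> shape (list of ints).
    split_by_shape is a hypothetical helper, not part of tf-faster-rcnn.
    """
    restorable, mismatched = [], {}
    for name, shape in sorted(graph_shapes.items()):
        ckpt_shape = ckpt_shapes.get(name)
        if ckpt_shape is None:
            continue  # variable is absent from the checkpoint, nothing to restore
        if list(shape) == list(ckpt_shape):
            restorable.append(name)
        else:
            mismatched[name] = (list(shape), list(ckpt_shape))
    return restorable, mismatched


# bbox_pred shapes are taken from the error message; the conv1 entry is an
# assumed placeholder standing in for a layer that does not depend on classes.
graph_shapes = {
    "resnet_v1_101/bbox_pred/weights": [2048, 16],
    "resnet_v1_101/conv1/weights": [7, 7, 3, 64],
}
ckpt_shapes = {
    "resnet_v1_101/bbox_pred/weights": [2048, 84],
    "resnet_v1_101/conv1/weights": [7, 7, 3, 64],
}

restorable, mismatched = split_by_shape(graph_shapes, ckpt_shapes)
for name, (graph_shape, ckpt_shape) in mismatched.items():
    print(name, "graph:", graph_shape, "checkpoint:", ckpt_shape)
```

In TensorFlow 1.x the checkpoint side of this comparison can be read with tf.train.NewCheckpointReader(path).get_variable_to_shape_map(), and a tf.train.Saver(var_list=...) can be limited to the variables that match. Whether skipping the class-dependent layers is appropriate depends on the goal: for testing, the simpler fix is probably to point test_net.py at a checkpoint trained with the same number of classes as the current graph.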

devendraswamy commented 4 years ago

I am also getting the same error. Were you able to solve it? Please help me.