ncoudray / DeepPATH

Classification of Lung cancer slide images using deep-learning

Assign requires shapes of both tensors to match. lhs shape= [2048,3] rhs shape= [2048,4] #23

Closed re1nth closed 5 years ago

re1nth commented 5 years ago

Hello, when I test the data using:

python3 /home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py --checkpoint_dir='/home/revanth/training' --eval_dir='/home/revanth/output_data' --data_dir="/home/revanth/test" --batch_size=10 --ImageSet_basename='test_' --run_once --ClassNumber 2 --mode='0_softmax' --TVmode='test'

I get the following error:

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [2048,3] rhs shape= [2048,4] [[Node: save/Assign_31 = Assign[T=DT_FLOAT, _class=["loc:@logits/logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](logits/logits/weights, save/RestoreV2:31)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py", line 268, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py", line 64, in main
    precision_at_1, current_score = nc_inception_eval.evaluate(dataset)
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/inception/nc_inception_eval.py", line 411, in evaluate
    precision_at_1, current_score = _eval_once(saver, summary_writer, top_1_op, top_5_op, summary_op, max_percent, all_filenames, filename_queue, net2048, sel_end_points, logits, labels)
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/inception/nc_inception_eval.py", line 72, in _eval_once
    saver.restore(sess, ckpt.model_checkpoint_path)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1759, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [2048,3] rhs shape= [2048,4] [[Node: save/Assign_31 = Assign[T=DT_FLOAT, _class=["loc:@logits/logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](logits/logits/weights, save/RestoreV2:31)]]

Caused by op 'save/Assign_31', defined at:
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py", line 268, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py", line 64, in main
    precision_at_1, current_score = nc_inception_eval.evaluate(dataset)
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/inception/nc_inception_eval.py", line 402, in evaluate
    saver = tf.train.Saver(variables_to_restore)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1281, in __init__
    self.build()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1293, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1330, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 778, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 112, in restore
    self.op.get_shape().is_fully_defined())
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/state_ops.py", line 216, in assign
    validate_shape=validate_shape)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 60, in assign
    use_locking=use_locking, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [2048,3] rhs shape= [2048,4] [[Node: save/Assign_31 = Assign[T=DT_FLOAT, _class=["loc:@logits/logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](logits/logits/weights, save/RestoreV2:31)]]

I tried removing all the partially trained models, and I tried training both with and without transfer learning. When it comes to testing, I still get the error above. Do you know where the issue is?
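For readers hitting the same message, it may help to read the two shapes side by side. In TensorFlow's Assign op the left-hand side is the variable in the current graph and the right-hand side is the value being restored, so the message suggests the evaluation graph built `logits/logits/weights` with 3 outputs while the checkpoint holds 4. The sketch below only parses the message text. `parse_assign_shapes` is a hypothetical helper, not part of DeepPATH or TensorFlow, and the idea that the last dimension tracks the number of classes is an assumption that this thread does not confirm.

```python
import re


def parse_assign_shapes(message):
    """Return (graph_shape, checkpoint_shape) from an 'Assign requires shapes' message.

    Hypothetical helper for illustration only.
    """
    match = re.search(
        r"lhs shape= \[([\d,\s]+)\] rhs shape= \[([\d,\s]+)\]", message
    )
    if match is None:
        raise ValueError("no lhs/rhs shapes found in message")

    def to_shape(text):
        # "2048,3" -> (2048, 3)
        return tuple(int(dim) for dim in text.split(","))

    return to_shape(match.group(1)), to_shape(match.group(2))


# Message text copied from the error reported above.
message = (
    "Assign requires shapes of both tensors to match. "
    "lhs shape= [2048,3] rhs shape= [2048,4]"
)
graph_shape, ckpt_shape = parse_assign_shapes(message)

# Assumption: the last dimension of the logits weights follows the class count.
print("outputs expected by the evaluation graph:", graph_shape[-1])
print("outputs stored in the checkpoint:", ckpt_shape[-1])
```

If that assumption holds, the checkpoint would have been trained with one more output than the evaluation run was configured for, so the class count used at training time and the value passed as `--ClassNumber` at test time are worth comparing.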

ncoudray commented 5 years ago

Hi - Sorry, I've never seen this issue. Maybe double-check all your inputs, especially the TFRecord images you used. Before you convert the JPGs to TFRecord, you should have exactly 1 sub-directory per class: no fewer, no more, and no other sub-directories. If you have 2 classes, you should have had 2 sub-directories, and so on. Check things like that at every step.
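The sub-directory check suggested above can be sketched in a few lines of standard-library Python. This is only an illustration: `class_subdirs` and `check_class_dirs` are hypothetical helpers, not DeepPATH functions, and the directory names in the demo are made up. Point `root` at the folder holding the sorted JPGs before the TFRecord conversion.

```python
import os
import tempfile


def class_subdirs(root):
    """List the immediate sub-directories of root, sorted by name."""
    return sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )


def check_class_dirs(root, expected_classes):
    """Return (ok, found): ok is True only if root holds exactly expected_classes sub-directories."""
    found = class_subdirs(root)
    return len(found) == expected_classes, found


# Demo on a throwaway directory: two class folders plus one stray folder.
with tempfile.TemporaryDirectory() as root:
    for name in ("class_a", "class_b", "leftover"):
        os.mkdir(os.path.join(root, name))
    ok, found = check_class_dirs(root, expected_classes=2)

print(ok, found)  # False ['class_a', 'class_b', 'leftover']
```

A stray folder like `leftover` is reported as a mismatch here; whether the conversion step would actually treat it as an extra class is an assumption to verify against the DeepPATH scripts.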