bourdakos1 / Custom-Object-Detection

Custom Object Detection with TensorFlow
https://medium.freecodecamp.org/tracking-the-millenium-falcon-with-tensorflow-c8c86419225e
MIT License
347 stars 181 forks

InvalidArgumentError: Assign requires shapes of both tensors to match. #10

Closed Kongsea closed 6 years ago

Kongsea commented 6 years ago

I only created the tfrecord files using my own dataset and changed num_classes in faster_rcnn_resnet101.config accordingly.

Then, when I ran the code, it raised the following error:

Caused by op u'save_1/Assign_815', defined at:
  File "object_detection/train.py", line 198, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/Custom-Object-Detection/object_detection/trainer.py", line 281, in train
    keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1218, in __init__
    self.build()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1227, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1263, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 751, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 439, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 160, in restore
    self.op.get_shape().is_fully_defined())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
    validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 57, in assign
    use_locking=use_locking, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [584] rhs shape= [8]
  [[Node: save_1/Assign_815 = Assign[T=DT_FLOAT, _class=["loc:@SecondStageBoxPredictor/BoxEncodingPredictor/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](SecondStageBoxPredictor/BoxEncodingPredictor/biases/Momentum, save_1/RestoreV2_815)]]

It seems that restoring the model failed. My own dataset has 146 classes, so the new shape, 584 = 146 * 4, does not match the original 2 classes * 4 = 8.
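The two shapes in the error message can be reproduced with simple arithmetic. The following is a minimal sketch, assuming the box-encoding biases hold four box coordinates per class, which is what the numbers 584 and 8 suggest. The function name `box_encoding_bias_size` is hypothetical and is not part of the TensorFlow Object Detection API.

```python
def box_encoding_bias_size(num_classes, coords_per_box=4):
    """Size of a bias vector that stores coords_per_box values per class.

    Hypothetical helper for illustration; coords_per_box=4 is an
    assumption inferred from the shapes in the error message.
    """
    return num_classes * coords_per_box


# Shape stored in the checkpoint being restored (trained with 2 classes).
checkpoint_size = box_encoding_bias_size(2)

# Shape of the variable built from a config with num_classes set to 146.
graph_size = box_encoding_bias_size(146)

print("checkpoint:", checkpoint_size, "graph:", graph_size)
print("shapes match:", checkpoint_size == graph_size)
```

If the two sizes differ, the saver is restoring a checkpoint that was written with a different number of classes than the graph being built.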

bourdakos1 commented 6 years ago

Hmm, did you create new tf records first? Maybe it’s still pointing at the old one?

Kongsea commented 6 years ago

I deleted all the old tf records and created new ones for my dataset. In the end, I found the cause: I had missed changing the class number in one place. It works now that I have changed it. Thank you.

bourdakos1 commented 6 years ago

Awesome :)

gingerhead22 commented 6 years ago

@Kongsea Hi, Kongsea, I have exactly the same problem as you. Would you mind letting me know where else we need to change the class number besides the .config file? Thank you.

Kongsea commented 6 years ago

Search for the original class number, 2, and the corresponding number of bbox coordinates, 8 (2*4), and replace both with the numbers that correspond to your dataset.

gingerhead22 commented 6 years ago

Hi, thank you for the quick response. What are the specific variable names? I didn't find original_class_number or bbox_coordinates_number. Thank you.

Kongsea commented 6 years ago

I mean that you should search for the numbers 2 and 8 and replace each of them. I am sorry, I cannot remember the specific parameters, so you will need to search for them yourself. Be careful to replace only the numbers related to the class number and the number of bbox coordinates.
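The search described above can be narrowed so that only standalone occurrences of the old class count and the old coordinate count are listed. This is a rough sketch, assuming the settings live in plain-text files and that each box has four coordinates. The helper `find_candidate_lines` and the sample keys other than `num_classes` are hypothetical, and the output still needs a manual review, since not every 2 or 8 is related to the class number.

```python
import re


def find_candidate_lines(text, old_num_classes, coords_per_box=4):
    """Return (line_number, line) pairs containing the old class count
    or the old coordinate count as a standalone integer.

    Hypothetical helper; coords_per_box=4 is an assumption.
    """
    targets = {str(old_num_classes), str(old_num_classes * coords_per_box)}
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Match whole integers only, skipping digits inside names or decimals.
        numbers = set(re.findall(r"(?<![\w.])\d+(?![\w.])", line))
        if numbers & targets:
            hits.append((lineno, line.strip()))
    return hits


# Made-up sample text; only num_classes is a key mentioned in this thread.
sample = """\
num_classes: 2
batch_size: 1
learning_rate: 0.0002
some_hypothetical_box_param: 8
"""

for lineno, line in find_candidate_lines(sample, old_num_classes=2):
    print(lineno, line)
```

To use it on a real file, read the file's contents into a string and pass that as `text`.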

raosushant commented 6 years ago

TF creates a "checkpoint" file. There might be one provided with the frozen inference graph; try deleting it, which should fix the problem.
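Before deleting anything, it can help to list which checkpoint-related files are present. The following is a minimal sketch, assuming the state file is named `checkpoint` and that saved checkpoints share the prefix `model.ckpt`; both names are assumptions about the directory layout, and the helper `find_checkpoint_files` is hypothetical. It only lists files and does not delete them.

```python
import tempfile
from pathlib import Path


def find_checkpoint_files(model_dir):
    """List files in model_dir that look checkpoint-related, sorted by name.

    Assumes the names 'checkpoint' and 'model.ckpt*'; adjust if your
    directory uses a different prefix.
    """
    model_dir = Path(model_dir)
    matches = [
        p for p in model_dir.iterdir()
        if p.is_file()
        and (p.name == "checkpoint" or p.name.startswith("model.ckpt"))
    ]
    return sorted(matches)


# Demonstration on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["checkpoint", "model.ckpt-0.index", "pipeline.config"]:
        (Path(tmp) / name).touch()
    found = [p.name for p in find_checkpoint_files(tmp)]

print(found)
```

Review the listed files by hand and remove only those left over from a run with a different class count.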

ultrasanity commented 5 years ago

@gingerhead22 @Kongsea were you able to find the specific parameters that you had to update? If so, can you list them?