shreymohan closed this issue 4 years ago.
Hi @shreymohan looks like your data_format is wrong: one is channels_first, another is channels_last. Could you double check your command lines?
Hey! Thank you so much for replying. data_format is set to channels_last in hparams.
This is the cls_output I get from the utils.build_model_with_precision method. I am fine-tuning on the UA-Detrac dataset, which has 4 classes:
cls outputs {
3: <tf.Tensor 'class_net/class-predict/BiasAdd:0' shape=(12, 64, 64, 36) dtype=float32>,
4: <tf.Tensor 'class_net/class-predict_1/BiasAdd:0' shape=(12, 32, 32, 36) dtype=float32>,
5: <tf.Tensor 'class_net/class-predict_2/BiasAdd:0' shape=(12, 16, 16, 36) dtype=float32>,
6: <tf.Tensor 'class_net/class-predict_3/BiasAdd:0' shape=(12, 8, 8, 36) dtype=float32>,
7: <tf.Tensor 'class_net/class-predict_4/BiasAdd:0' shape=(12, 4, 4, 36) dtype=float32>
}
These class outputs are passed to the detection_loss method in the det_model_fn.py file.
This is the cls_output I get while fine-tuning on VOC:
cls outputs {
3: <tf.Tensor 'class_net/class-predict/BiasAdd:0' shape=(12, 64, 64, 180) dtype=float32>,
4: <tf.Tensor 'class_net/class-predict_1/BiasAdd:0' shape=(12, 32, 32, 180) dtype=float32>,
5: <tf.Tensor 'class_net/class-predict_2/BiasAdd:0' shape=(12, 16, 16, 180) dtype=float32>,
6: <tf.Tensor 'class_net/class-predict_3/BiasAdd:0' shape=(12, 8, 8, 180) dtype=float32>,
7: <tf.Tensor 'class_net/class-predict_4/BiasAdd:0' shape=(12, 4, 4, 180) dtype=float32>
}
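For reference, the last dimension of these class outputs is consistent with the usual anchor-based layout of anchors per location times number of classes. Here is a minimal sketch of that arithmetic, assuming 9 anchors per location (3 scales times 3 aspect ratios) and 20 classes for VOC. Both numbers are assumptions inferred from the shapes, and `class_head_channels` is a hypothetical helper, not a function from the repo.

```python
def class_head_channels(num_classes, num_scales=3, num_aspect_ratios=3):
    """Channels of the class-prediction head: anchors per location * classes.

    The defaults of 3 scales and 3 aspect ratios are assumptions, chosen
    because they reproduce the shapes logged in this thread.
    """
    num_anchors = num_scales * num_aspect_ratios  # assumed: 9 anchors per location
    return num_anchors * num_classes


# UA-Detrac run: 4 classes; VOC run: assumed 20 classes.
ua_detrac_channels = class_head_channels(num_classes=4)
voc_channels = class_head_channels(num_classes=20)
print(ua_detrac_channels, voc_channels)
```

Under these assumptions the two runs give 36 and 180 channels, which are the two values that later collide in the shape error.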
Any kind of help would be much appreciated. Any thoughts, @fsx950223?
These are the shapes of logits and targets in the focal loss method when the model is initialized:
logits: Tensor("class_net/class-predict/BiasAdd:0", shape=(12, 64, 64, 36), dtype=float32) targets Tensor("Reshape:0", shape=(12, 64, 64, 36), dtype=float32)
logits: Tensor("class_net/class-predict_1/BiasAdd:0", shape=(12, 32, 32, 36), dtype=float32) targets Tensor("Reshape_2:0", shape=(12, 32, 32, 36), dtype=float32)
logits: Tensor("class_net/class-predict_2/BiasAdd:0", shape=(12, 16, 16, 36), dtype=float32) targets Tensor("Reshape_4:0", shape=(12, 16, 16, 36), dtype=float32)
logits: Tensor("class_net/class-predict_3/BiasAdd:0", shape=(12, 8, 8, 36), dtype=float32) targets Tensor("Reshape_6:0", shape=(12, 8, 8, 36), dtype=float32)
logits: Tensor("class_net/class-predict_4/BiasAdd:0", shape=(12, 4, 4, 36), dtype=float32) targets Tensor("Reshape_8:0", shape=(12, 4, 4, 36), dtype=float32)
But when actual data is loaded for training, it gives this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Incompatible shapes: [12,180,64,64] vs. [12,36,64,64]
[[{{node focal_loss/logistic_loss/GreaterEqual}}]]
[[strided_slice_3/_8013]]
(1) Invalid argument: Incompatible shapes: [12,180,64,64] vs. [12,36,64,64]
[[{{node focal_loss/logistic_loss/GreaterEqual}}]]
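Note that the error lists its shapes channels-first, while the shapes logged at model initialization are channels-last. Here is a small sketch that rewrites the error shapes in channels-last order so they can be compared directly with the logged logits. It assumes the error shapes are NCHW, and `nchw_to_nhwc` is a hypothetical helper written for this comparison only.

```python
def nchw_to_nhwc(shape):
    """Reorder a 4-D shape from (N, C, H, W) to (N, H, W, C)."""
    n, c, h, w = shape
    return (n, h, w, c)


# Shapes copied from the error message, assumed to be channels-first.
runtime_a = nchw_to_nhwc((12, 180, 64, 64))
runtime_b = nchw_to_nhwc((12, 36, 64, 64))

# Indices of the dimensions that differ between the two operands.
mismatched_dims = [
    i for i, (a, b) in enumerate(zip(runtime_a, runtime_b)) if a != b
]
print(runtime_a, runtime_b, mismatched_dims)
```

Read this way, the two operands differ only in the channel dimension (180 vs. 36), and the batch and spatial dimensions agree.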
Maybe there is something wrong in the way I generated tfrecords?
It's a very basic problem: you should use a new ckpt path before training a new model.
@fsx950223 I am using the COCO ckpt for efficientdet-d0 from this repo.
Where is your ckpt stored?
Thank you so much! I was pointing model_dir to an old path. It starts to train now, thanks again for the cue.
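For anyone hitting the same error, here is a minimal sketch of a pre-flight check that looks for leftover checkpoint files in model_dir before starting a new run. It assumes the standard TensorFlow checkpoint file names (a `checkpoint` state file plus `*.index` and `*.meta` files). `find_stale_checkpoints` and the example path are hypothetical and not part of the repo.

```python
import glob
import os


def find_stale_checkpoints(model_dir):
    """Return checkpoint-related files that already exist in model_dir.

    Assumes the usual TensorFlow naming: a 'checkpoint' state file and
    '*.index' / '*.meta' files next to it.
    """
    if not os.path.isdir(model_dir):
        return []
    found = []
    for pattern in ("checkpoint", "*.index", "*.meta"):
        found.extend(glob.glob(os.path.join(model_dir, pattern)))
    return sorted(found)


# Hypothetical path; replace with the model_dir passed to training.
stale = find_stale_checkpoints("/tmp/efficientdet-d0-ua-detrac")
if stale:
    print("model_dir already contains checkpoints from an earlier run:")
    for path in stale:
        print("  " + path)
else:
    print("model_dir has no checkpoints; safe to start a new run.")
```

If the check reports files from an earlier run with a different number of classes, pointing model_dir at a fresh directory avoids mixing the two runs.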
I get the following error message while training EfficientDet-d0:
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Incompatible shapes: [12,180,64,64] vs. [12,36,64,64]
[[node focal_loss/logistic_loss/GreaterEqual (defined at /home/shrey/work_safety/vehicle_detection/train_val/automl/efficientdet/det_model_fn.py:165) ]]
[[strided_slice_3/_8013]]
(1) Invalid argument: Incompatible shapes: [12,180,64,64] vs. [12,36,64,64]
[[node focal_loss/logistic_loss/GreaterEqual (defined at /home/shrey/work_safety/vehicle_detection/train_val/automl/efficientdet/det_model_fn.py:165) ]]
0 successful operations. 0 derived errors ignored.
@mingxingtan any suggestions? Using latest code and TF version 2.2.0-rc4