matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

After adjusting anchors for large objects, indices out of range? #864

Open CallumMoscrip opened 6 years ago

CallumMoscrip commented 6 years ago

I am having issues training on my own dataset. I have 4 classes, and the images are 640 x 640, which are then padded up to 1024 using the "square" resize mode. The problem is that I have adjusted BACKBONE_STRIDES and RPN_ANCHOR_SCALES to suit my data, which is made up of large objects approximately 300-500 pixels square. Once the network is trained I cannot evaluate it: when I use the inspect_model notebook it tells me that I have 9666 anchors, and I get this error message:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input> in <module>()
      5                 dataset.image_reference(image_id)))
      6 # Run object detection
----> 7 results = model.detect([image], verbose=1)
      8
      9 # Display results

/host/Mask_RCNN/mrcnn/model.py in detect(self, images, verbose)
   2529         # Run object detection
   2530         detections, _, _, mrcnn_mask, _, _, _ =\
-> 2531             self.keras_model.predict([molded_images, image_metas, anchors], verbose=0)
   2532         # Process detections
   2533         results = []

/usr/local/lib/python3.5/dist-packages/keras/engine/training.py in predict(self, x, batch_size, verbose, steps)
   1165                                 batch_size=batch_size,
   1166                                 verbose=verbose,
-> 1167                                 steps=steps)
   1168
   1169     def train_on_batch(self, x, y,

/usr/local/lib/python3.5/dist-packages/keras/engine/training_arrays.py in predict_loop(model, f, ins, batch_size, verbose, steps)
    292                 ins_batch[i] = ins_batch[i].toarray()
    293
--> 294         batch_outs = f(ins_batch)
    295         batch_outs = to_list(batch_outs)
    296         if batch_index == 0:

/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py in __call__(self, inputs)
   2664             return self._legacy_call(inputs)
   2665
-> 2666             return self._call(inputs)
   2667         else:
   2668             if py_any(is_tensor(x) for x in inputs):

/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py in _call(self, inputs)
   2634                                 symbol_vals,
   2635                                 session)
-> 2636         fetched = self._callable_fn(*array_vals)
   2637         return fetched[:len(self.outputs)]
   2638

/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py in __call__(self, *args)
   1452         else:
   1453             return tf_session.TF_DeprecatedSessionRunCallable(
-> 1454                 self._session._session, self._handle, args, status, None)
   1455
   1456     def __del__(self):

/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    517             None, None,
    518             compat.as_text(c_api.TF_Message(self.status.status)),
--> 519             c_api.TF_GetCode(self.status.status))
    520         # Delete the underlying status object from memory otherwise it stays alive
    521         # as there is a reference to status from this from the traceback due to

InvalidArgumentError: indices[0] = 62091 is not in [0, 9666)

My config looks like this:

Configurations:
BACKBONE                       resnet101
BACKBONE_STRIDES               [13, 19, 25, 32, 38]
BATCH_SIZE                     1
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.7
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 1
IMAGE_MAX_DIM                  1024
IMAGE_META_SIZE                16
IMAGE_MIN_DIM                  800
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1024 1024 3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'mrcnn_bbox_loss': 1.0, 'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           raspberry
NUM_CLASSES                    4
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.75, 1, 1.5]
RPN_ANCHOR_SCALES              (208, 304, 400, 512, 608)
RPN_ANCHOR_STRIDE              2
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.9
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                20
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           40
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               50
WEIGHT_DECAY                   0.0001

Any help greatly appreciated.
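For reference, the anchors passed to detect() are generated from the config at inference time, so the count can be checked against any config before running detection. A minimal sketch (the helper name count_anchors is made up; the mrcnn calls are the library's own):

```python
# Sketch: count the anchors a given config generates, using the library's
# own helpers, so training and inference configs can be compared before
# calling detect(). count_anchors() itself is a hypothetical helper.
from mrcnn import utils
from mrcnn.model import compute_backbone_shapes

def count_anchors(config):
    # Feature-map shapes are derived from IMAGE_SHAPE and BACKBONE_STRIDES.
    backbone_shapes = compute_backbone_shapes(config, config.IMAGE_SHAPE)
    anchors = utils.generate_pyramid_anchors(
        config.RPN_ANCHOR_SCALES,   # one scale per pyramid level
        config.RPN_ANCHOR_RATIOS,   # ratios applied at each location
        backbone_shapes,
        config.BACKBONE_STRIDES,
        config.RPN_ANCHOR_STRIDE)
    return anchors.shape[0]         # 9666 for the config shown above
```

One likely cause of the mismatch: in this implementation BACKBONE_STRIDES only affects anchor generation; the ResNet/FPN feature maps themselves are still downsampled by the fixed factors 4, 8, 16, 32, and 64, so the RPN scores many more locations than the 9666 anchors generated with strides [13, 19, 25, 32, 38]. That would explain the out-of-range index 62091 in the traceback.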
pstalidis commented 6 years ago

I have a similar issue. Changing RPN_ANCHOR_SCALES produces 261120 anchors, but the saved model (fine-tuned from MS COCO with the same config) expects 261888 anchors. Could it be that the pretrained RPN model produces anchors for RPN_ANCHOR_SCALES = (32, 64, 128, 256, 512)?
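Those two counts are consistent with that hypothesis. A quick arithmetic check (assuming a 1024 x 1024 input, 3 anchor ratios, RPN_ANCHOR_STRIDE = 1, and the default BACKBONE_STRIDES of [4, 8, 16, 32, 64]):

```python
# Anchors per level = feature_h * feature_w * num_ratios.
# Five pyramid levels (the default) give the count the saved model expects:
print(sum(3 * (1024 // s) ** 2 for s in [4, 8, 16, 32, 64]))  # 261888
# Four levels give the count the modified config produces:
print(sum(3 * (1024 // s) ** 2 for s in [4, 8, 16, 32]))      # 261120
```

So 261888 is exactly the default five-scale anchor count, and 261120 matches a four-scale config. Note that the count depends on the number of entries in RPN_ANCHOR_SCALES (one per pyramid level), not on their values.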

kenanozturk commented 6 years ago

Hi, I have the same problem. Did you find any solutions, @CallumMoscrip @pstalidis?

pstalidis commented 6 years ago

@kenanozturk I trained the model from scratch.
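For anyone hitting the same mismatch, a minimal sketch of that from-scratch route (config, dataset_train, dataset_val, the epoch count, and the log directory are all placeholders):

```python
# Sketch: train from random initialization with a self-consistent config,
# rather than fine-tuning COCO weights built around the default anchor settings.
# dataset_train / dataset_val are assumed to be prepared mrcnn Dataset objects.
import mrcnn.model as modellib

model = modellib.MaskRCNN(mode="training", config=config, model_dir="logs")
# No pretrained weights loaded; train every layer from scratch.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=40,
            layers="all")
```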