matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

FPN, RPN config for small objects and low resolution images #1166

Open dmitryshendryk opened 5 years ago

dmitryshendryk commented 5 years ago

Hello everyone.

I have a use case where I need to train a model on small images of shape (80, 140). The objects I want to detect in these images are even smaller: they are characters.

I have already implemented several data augmentation steps such as blur, rotation, and flip. These are the config parameters I have redefined:

IMAGES_PER_GPU = 2

NUM_CLASSES = 1 + 33

STEPS_PER_EPOCH = 100

DETECTION_MIN_CONFIDENCE = 0.9

RPN_ANCHOR_SCALES = (16, 32, 48, 64, 88)

RPN_ANCHOR_RATIOS = [0.2, 0.5, 1]

IMAGE_MIN_DIM = int(128)
IMAGE_MAX_DIM = int(320)

Other params are default.
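To judge whether the anchor scales above fit the objects, it can help to estimate how large a character is after the image has been resized. The following is a rough sketch, assuming the default "square" resize mode and approximating the scaling rule in the project's `utils.resize_image` (scale the short side up to `IMAGE_MIN_DIM`, but never let the long side exceed `IMAGE_MAX_DIM`). The function name `resize_scale` and the 20 x 10 pixel character size are hypothetical and only for illustration.

```python
import math


def resize_scale(h, w, min_dim, max_dim):
    """Approximate scale factor applied to an (h, w) image in 'square' mode.

    This mirrors the rule as understood from the project, not the library
    code itself: upscale so the short side reaches min_dim (never
    downscale below 1 here), then cap so the long side fits in max_dim.
    """
    scale = max(1.0, min_dim / min(h, w))
    if round(max(h, w) * scale) > max_dim:
        scale = max_dim / max(h, w)
    return scale


# Values taken from the config above and the stated image shape (80, 140).
scale = resize_scale(80, 140, min_dim=128, max_dim=320)

# Hypothetical character size in the original image (assumption).
char_h, char_w = 20, 10
scaled_h, scaled_w = char_h * scale, char_w * scale

# An anchor scale is the side length of a square anchor in pixels, so the
# square root of the object's area is a convenient size to compare against.
object_size = math.sqrt(scaled_h * scaled_w)

anchor_scales = (16, 32, 48, 64, 88)
closest = min(anchor_scales, key=lambda s: abs(s - object_size))

print(f"scale factor: {scale:.2f}")
print(f"character after resize: {scaled_h:.0f} x {scaled_w:.0f} px")
print(f"object size (sqrt of area): {object_size:.1f} px, closest anchor: {closest}")
```

Under these assumptions, objects much smaller than the smallest anchor scale would suggest lowering the scales or raising `IMAGE_MIN_DIM`; the right values depend on the real character sizes in the dataset.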

The model can detect characters, though not very well yet, so there should be room for improvement.

So my question is: what is the best FPN and RPN configuration for detecting small objects in small images? I think these, or perhaps other hyperparameters, need to be adjusted.

Can you share any best practices for configuring Mask R-CNN for this use case?

ashnair1 commented 5 years ago

@dmitryshendryk Not an answer to your question, but I had a question regarding your augmentation. When you applied your augmentations, did you also change your annotations so that they match the augmented image? If so, how?

For example, if I wanted to detect a building in the top half of the image and the image is rotated 90 degrees clockwise, then the building will be on the right side of the augmented image. But the annotations will still say it's on the top half of the image. Wouldn't this cause problems? Or is it taken into account and changed internally?

dmitryshendryk commented 5 years ago

@ash1995 When I rotate the image or apply other augmentations, I also recalculate all points of the bounding box, so that is not a problem.

ashnair1 commented 5 years ago

@dmitryshendryk Is the recalculation of the bounding box and segmentation masks done by you via a function you wrote yourself or is this recalculation possible from within the project i.e. is there a function that does the recalculation within the Mask RCNN project pipeline?

If possible could you show me how you do this recalculation?

dmitryshendryk commented 5 years ago

@ash1995 I do it myself. It's not difficult: you just need to find the transformation matrix of the augmentation and then apply it to the bounding box.
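The idea can be illustrated with a minimal sketch. This is not the commenter's actual code, which is not shown in the thread; it assumes the augmentation is a rotation about a known center and that boxes are stored as `(x1, y1, x2, y2)` in pixel coordinates. The helper names `rotation_matrix` and `transform_box` are hypothetical.

```python
import math


def rotation_matrix(angle_deg, cx, cy):
    """Return a 2x3 affine matrix rotating points by angle_deg around (cx, cy).

    In image coordinates (y axis pointing down), a positive angle appears
    as a clockwise rotation on screen.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [
        [cos_a, -sin_a, cx - cos_a * cx + sin_a * cy],
        [sin_a, cos_a, cy - sin_a * cx - cos_a * cy],
    ]


def transform_box(box, m):
    """Apply affine matrix m to the four corners of box = (x1, y1, x2, y2).

    Returns the axis-aligned box that encloses the transformed corners.
    """
    x1, y1, x2, y2 = box
    corners = [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
    xs = [m[0][0] * x + m[0][1] * y + m[0][2] for x, y in corners]
    ys = [m[1][0] * x + m[1][1] * y + m[1][2] for x, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# Example: a 100 x 100 image rotated by 90 degrees around its center.
m = rotation_matrix(90, cx=50, cy=50)
box = (10, 10, 30, 40)  # a box near the top-left corner
new_box = transform_box(box, m)
print([round(v, 3) for v in new_box])
```

For rotations that are not multiples of 90 degrees, the enclosing box of the rotated corners is looser than the object itself, so transforming the segmentation mask with the same matrix and recomputing the box from the mask gives a tighter result.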

rakehsaleem commented 5 years ago

Can you show some code or a script for the transformation matrix you apply after image augmentation? Thanks