matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Training on dataset where some images are all BG class #757

Open rbavery opened 6 years ago

rbavery commented 6 years ago

Hello, I'm trying to apply mrcnn to satellite imagery. I'm inspecting the data for errors since I was getting poor training results (I've already made adjustments for the fact that my inputs have 8 channels instead of 3.) When I use the inspect data notebook on my dataset, I see that a lot of images have no objects (over 100 out of 469). But in the nucleus example, all training images have at least one object. I think an issue I'm having is that I am not correctly encoding the mask for negative samples with no objects.

Here is my data inspection notebook with example images: https://github.com/rbavery/CropMask_RCNN/blob/master/notebooks/crops/inspect_crop_data.ipynb

For each image that has no field, I have an empty mask of shape (262, 262, 1) that is all zeros, the same shape as the image except for the channels. This differs from a mask with one field, which would have 1s for the field and 0s elsewhere. mrcnn.visualize.display_instances() works on images with at least 1 field but fails for images with no fields. The reason is that load_image_gt() in mrcnn/model.py changes the shape of the masks with no objects from (262, 262, 1) to (262, 262, 0) in this step:

    # Note that some boxes might be all zeros if the corresponding mask got cropped out.
    # and here is to filter them out
    _idx = np.sum(mask, axis=(0, 1)) > 0
    print(_idx, 'idx')
    mask = mask[:, :, _idx]
    print(mask.shape, '1275')
    class_ids = class_ids[_idx]
    # Bounding boxes. Note that some boxes might be all zeros
    # if the corresponding mask got cropped out.
    # bbox: [num_instances, (y1, x1, y2, x2)]
    bbox = utils.extract_bboxes(mask)

This then makes mask an empty array, with no zeros to indicate the background class, causing display to fail and possibly the model training to fail (still not sure if this is what causes training issues but I think it is likely)
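
The shape change described above can be reproduced with NumPy alone. This is a minimal sketch that applies the same filter to a made-up all-zero mask; the class ID of 1 is an assumption for illustration, and nothing here touches the mrcnn package itself.

```python
import numpy as np

# An all-zero mask for an image with no fields, shaped like the one in the post.
mask = np.zeros((262, 262, 1), dtype=bool)
class_ids = np.array([1], dtype=np.int32)  # assumed label, for illustration only

# Same filter as in load_image_gt(): keep only instances with at least one pixel.
keep = np.sum(mask, axis=(0, 1)) > 0
filtered_mask = mask[:, :, keep]
filtered_class_ids = class_ids[keep]

print(keep)                      # [False]
print(filtered_mask.shape)       # (262, 262, 0)
print(filtered_class_ids.shape)  # (0,)
```

After the filter, the instance axis has length 0 and the class ID array is empty, so code that expects at least one instance has to handle that case explicitly.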

Any suggestions on how to encode sample masks with only the background class, no objects?

mekomlusa commented 6 years ago

+1 on this. I'm working on a dataset in which some sample images contain no objects at all, as they're negative examples. Your thread makes me wonder whether this package has poor support for background-only images.

Edited: I don't have training issues when training on a mixture of images with objects and pure background. But my training results were poor as well. I'm wondering whether one should exclude all the pure background images in order to make the package work.

rbavery commented 6 years ago

Yeah @mekomlusa I was wondering if discarding is the best way to go. If I tried that I'd lose about 100 out of ~500 samples, but interested to hear if that works out for you if you try it.

Since you're not encountering NaNs in training, I think there might be another reason my rpn bbox loss decreases but eventually turns to NaN. I used the inspect model notebook to visualize what was going on and it looks like right before rpn bbox goes to nan, there are many regions proposed that are way off from the actively farmed fields in the top corners. But I'm not getting much of an idea from the inspection why rpn bbox loss goes to nan. Any suggestions are much appreciated!

The next thing I'll try is to just train on 3 channel RGB rather than 8 channels so that I can pre-train on ImageNet instead of training from scratch. I'll also use larger inputs than ~256x256. I'm currently gridding up 8 ~2500x2500 images and can adjust the inputs to 512x512, since I'm noticing a lot of fields are only partially present.

Here are my results for one field at the 20th epoch. Training fails at the 21st epoch.

[Images: region_proposals_classified, selection_001]

Configurations:
BACKBONE                       resnet50
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     1
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
CHANNELS_NUM                   8
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.7
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 1
IMAGE_MAX_DIM                  256
IMAGE_META_SIZE                14
IMAGE_MIN_DIM                  256
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              pad64
IMAGE_SHAPE                    [256 256   8]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               28
MEAN_PIXEL                     [259.6  347.   259.8  416.3  228.23 313.4  187.5  562.9 ]
MINI_MASK_SHAPE                (56, 56)
NAME                           wv2-gridded-no-partial
NUM_CLASSES                    2
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (16, 32, 64, 128, 180)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                1000
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           100
USE_MINI_MASK                  False
USE_RPN_ROIS                   True
VALIDATION_STEPS               100
WEIGHT_DECAY                   0.0001

mekomlusa commented 6 years ago

@rbavery Thanks for sharing your config. What I'm afraid of is that in reality, the input images I feed to my trained model will be either pure background or contain the objects of interest. If I train the model extensively without any "noise" data, I suspect that the model may fail when an image is indeed pure background with no object. But I will try (if time permits :P), though that means I'll lose half of my data (~2,000 pics in total).

As for your rpn_bbox_loss going to NaN - did you have this on other loss metrics as well? I did one time, like after epoch 5 all my losses suddenly became NaN (both training and validation). That's because my learning rate was set too high. The default parameter is 0.001 and looks like you didn't change yours as well, but for my case I have to set it lower or the whole model diverges.
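
A lower learning rate is usually set by overriding the attribute in a config subclass. The sketch below uses a stand-in `Config` class so that it runs without the package; in a real project the base class would be `mrcnn.config.Config`. The value 0.0001 is illustrative only, not a recommendation from this thread.

```python
class Config:
    """Stand-in for mrcnn.config.Config, holding the two defaults shown in the dump above."""
    LEARNING_RATE = 0.001
    GRADIENT_CLIP_NORM = 5.0


class LowLRConfig(Config):
    # Illustrative value only: ten times smaller than the default.
    LEARNING_RATE = 0.0001


config = LowLRConfig()
print(config.LEARNING_RATE, config.GRADIENT_CLIP_NORM)
```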

mekomlusa commented 6 years ago

BTW, training extensively using only images with objects makes the situation worse. My model now always "sees" things when there's nothing there. It is even more sensitive than before.

rbavery commented 6 years ago

Thanks for the update and help @mekomlusa. I think that the mrcnn model.py can handle negative samples just fine then. It was only the rpn_bbox_loss that went to NaN after decreasing for 20 epochs. Once rpn_bbox_loss went to NaN, all other losses went up in magnitude but did not go to NaN (except for the overall loss). I'll try a lower learning rate if I end up getting the same error with a simpler case: instead of parsing fields/no fields, I'm going to try circular fields vs everything else.

eljirg commented 6 years ago

Hi,

I also have the same question regarding images with background only.

I am using another dataset (car damage) with the balloon sample and it worked quite well. However, when I tried to splash (infer) on a new image, it also segmented other objects besides the damage (a motorbike, for example, although there are no motorbikes at all in the annotated images).

I saw in balloon.py that training images without regions (annotations) are automatically excluded. I am wondering whether it is possible, and a good idea, to add images without annotations so that the model can learn that objects such as motorbikes are not objects to segment.

Have any of you tried adding the BG only images? @rbavery @mekomlusa

Thx!

mekomlusa commented 6 years ago

@eljirg Yes, previously my training examples were a mixture of background-only images + true samples. It's better than the other set, which consists solely of true samples. But to make it work (esp. if you have only 2 classes) you need to tweak the code a little bit (mainly the visualization/IoU part). I didn't use the balloon example so I cannot comment more on that. Good luck!

SreenivasVRao commented 6 years ago

@mekomlusa , can you explain how to tweak the code? I see compute_overlaps_mask in utils.py

I need to compute mIOU, but this function throws errors. My mask dimensions are (H, W, 0) at times.

    if masks1.shape[0] == 0 or masks2.shape[0] == 0:
        return np.zeros((masks1.shape[0], masks2.shape[-1]))
    # flatten masks and compute their areas
    masks1 = np.reshape(masks1 > .5, (-1, masks1.shape[-1])).astype(np.float32)
    masks2 = np.reshape(masks2 > .5, (-1, masks2.shape[-1])).astype(np.float32)
    area1 = np.sum(masks1, axis=0)
    area2 = np.sum(masks2, axis=0)
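
For masks of shape (H, W, 0), the first axis is the image height, so a guard on shape[0] does not fire; the instance count sits on the last axis. The following is a minimal standalone sketch, not the library's own function: the name `overlaps_masks_sketch` is hypothetical, and treating the IoU of two empty masks as 0 is an assumption made here to avoid 0/0.

```python
import numpy as np


def overlaps_masks_sketch(masks1, masks2):
    """IoU between two sets of masks, each shaped [H, W, instances]."""
    n1, n2 = masks1.shape[-1], masks2.shape[-1]
    # Instances live on the last axis, so that is the axis to test for emptiness.
    if n1 == 0 or n2 == 0:
        return np.zeros((n1, n2))
    # Flatten each instance mask to a column and compute areas.
    flat1 = np.reshape(masks1 > .5, (-1, n1)).astype(np.float32)
    flat2 = np.reshape(masks2 > .5, (-1, n2)).astype(np.float32)
    area1 = np.sum(flat1, axis=0)
    area2 = np.sum(flat2, axis=0)
    intersections = np.dot(flat1.T, flat2)
    union = area1[:, None] + area2[None, :] - intersections
    # Assumption: define IoU as 0 where the union is empty, instead of 0/0 = NaN.
    return np.divide(intersections, union,
                     out=np.zeros_like(intersections), where=union > 0)


empty = np.zeros((4, 4, 0))
two = np.zeros((4, 4, 2))
two[:2, :, 0] = 1  # first instance covers the top half, second stays all zero
print(overlaps_masks_sketch(empty, two).shape)  # (0, 2)
```

With this guard, a ground-truth or prediction set with zero instances yields an empty overlap matrix rather than a reshape error.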

mekomlusa commented 6 years ago

@SreenivasVRao Hey sorry for replying late. Below is what I've changed on my end in order to make it work:

    # If either set of masks is empty return empty result
    if masks1.shape[0] == 0 or masks2.shape[0] == 0:        
        return np.zeros((masks1.shape[0], masks2.shape[-1]))           
    # Added the following two checker methods below to ensure that image with no mask detected is returned correctly
    # ideas brought from https://github.com/matterport/Mask_RCNN/issues/532
    if np.sum((masks2 > .5).astype(np.uint8)) == 0:
        return np.zeros((masks1.shape[0], masks2.shape[-1]))
    if np.sum((masks1 > .5).astype(np.uint8)) == 0:
        return np.zeros((masks1.shape[0], masks2.shape[-1]))
    # flatten masks and compute their areas    
    masks1 = np.reshape(masks1 > .5, (-1, masks1.shape[-1])).astype(np.float32)
    masks2 = np.reshape(masks2 > .5, (-1, masks2.shape[-1])).astype(np.float32)

ramicetty commented 6 years ago

@mekomlusa Could you kindly suggest how to load and label negative images with their empty masks?

pradeeprathore04 commented 6 years ago

@mekomlusa @SreenivasVRao @rbavery @eljirg I would be grateful if you could mention what changes you made in model.py to train on negative images (i.e. images that don't have any mask). I have tried the following:

    if config.TRAIN_ON_BG_ONLY_IMAGES:
        # If the image doesn't have any objects, still use negative ROIs
        # for training (config.TRAIN_ROIS_PER_IMAGE of them)
        negative_count = tf.cond(tf.greater(positive_count, 0),
                                 lambda: tf.cast(r * tf.cast(positive_count, tf.float32), tf.int32) - positive_count,
                                 lambda: config.TRAIN_ROIS_PER_IMAGE)
    else:
        negative_count = tf.cast(r * tf.cast(positive_count, tf.float32), tf.int32) - positive_count

But after making the changes, when I pass images that are all negative, I do not see any loss in the console (which I take to mean that the model is not being trained). I added config.TRAIN_ON_BG_ONLY_IMAGES as an extra parameter. The above code is from def detection_targets_graph(proposals, gt_class_ids, gt_boxes, gt_masks, config):

1) Do I have to make changes anywhere else to make sure that the model is trained on background-only images?
2) If the model is being trained on background-only images, how can I verify that? One way is to use pretrained weights and watch the loss while giving only background-only images for training. But in that case I do not see any loss on the console while the model is training.

Any help is highly appreciated. Thanks a lot
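
The arithmetic in the negative_count expression can be checked in plain Python. This is only a sketch of the quoted formula: `negative_roi_count` and its `fallback` argument are hypothetical names standing in for the TRAIN_ON_BG_ONLY_IMAGES branch, and it says nothing about how the downstream losses treat the result.

```python
def negative_roi_count(positive_count, roi_positive_ratio, fallback=None):
    """Plain-Python mirror of the negative_count expression quoted above.

    `fallback` is a hypothetical stand-in for the extra branch: the count
    to use when there are no positive ROIs at all.
    """
    if positive_count == 0 and fallback is not None:
        return fallback
    r = 1.0 / roi_positive_ratio
    return int(r * positive_count) - positive_count


print(negative_roi_count(10, 0.5))                # 10
print(negative_roi_count(0, 0.33))                # 0
print(negative_roi_count(0, 0.33, fallback=100))  # 100
```

Without the extra branch, zero positive ROIs yields zero negative ROIs, so a background-only image contributes no ROIs at this step. The extra branch changes that count, but any later step that assumes at least one positive ROI would still need checking separately.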

SreenivasVRao commented 6 years ago

Sorry I don't have access to the codebase anymore. But I think this will help: https://github.com/matterport/Mask_RCNN/issues/532#issuecomment-400091605

I discarded NaN values.

pradeeprathore04 commented 6 years ago

@SreenivasVRao Thanks a lot for your prompt reply. I am trying the suggestion in #532

mekomlusa commented 6 years ago

Yep, @SreenivasVRao is right. I cannot access my original codebase anymore, but I believe my comment for that issue should work.

windson87 commented 5 years ago

@mekomlusa @pradeeprathore04 @SreenivasVRao @rbavery Hi, I have one further question regarding images with pure background. In the load_image_gt function, loading the instance masks for each image has to return class IDs:

    image = dataset.load_image(image_id)
    mask, class_ids = dataset.load_mask(image_id)

In the nucleus example, the load_mask function returns ID 1 for every instance:

    # load_mask function (nucleus example)
    # Return mask, and array of class IDs of each instance. Since we have
    # one class ID, we return an array of ones
    return mask, np.ones([mask.shape[-1]], dtype=np.int32)

So in my case, for images with only background, I would return class ID 0. But later on in load_image_gt, the active classes are all set to 1 again (in the binary case):

    # Active classes
    # Different datasets have different classes, so track the
    # classes supported in the dataset of this image.
    active_class_ids = np.zeros([dataset.num_classes], dtype=np.int32)
    source_class_ids = dataset.source_class_ids[dataset.image_info[image_id]["source"]]
    active_class_ids[source_class_ids] = 1

I'm a bit confused and wonder how you deal with this problem. It would be great to hear some advice. Thanks ;)
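
Going by the quoted lines, active_class_ids has one entry per class and depends only on the image's source, while class_ids has one entry per instance. The sketch below separates the two with NumPy; the value of `source_class_ids` is an assumption for a single-source dataset with background plus one object class.

```python
import numpy as np

num_classes = 2                      # BG + one object class, as in NUM_CLASSES above
source_class_ids = np.array([0, 1])  # assumed: class IDs registered for this image's source

# Per-source flag vector, one entry per class (same logic as the quoted lines).
active_class_ids = np.zeros([num_classes], dtype=np.int32)
active_class_ids[source_class_ids] = 1

# Per-image labels, one entry per instance; a background-only image has none.
class_ids_with_objects = np.ones([3], dtype=np.int32)
class_ids_background_only = np.empty([0], dtype=np.int32)

print(active_class_ids)                 # [1 1]
print(class_ids_background_only.shape)  # (0,)
```

Under this reading, setting every active class to 1 does not assign a label to any instance, and a background-only image is represented by an empty class_ids array rather than by an instance with class ID 0.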

smudge1872 commented 5 years ago

@pallaviroyal Did you find out how to load the masks of background images only?

ramicetty commented 5 years ago

@smudge1872 Yes, I did. I was doing it for the tgs-salt-identification-challenge, and I found how to prepare the dataset object for Mask_RCNN. I checked my backup but, apologies, I have lost my code. However, the Mask_RCNN implementation is not meant for BG-only images either, because internally it has an ROI (Region of Interest) layer.

smudge1872 commented 5 years ago

@pallaviroyal @mekomlusa So it is not possible to add background(negative) images to the training set?

mekomlusa commented 5 years ago

@smudge1872 I won't say it's not possible, I got the best result when using half pure background + half normal images (with the objects of interest). Code is available in the earlier thread.

smudge1872 commented 5 years ago

Thanks @mekomlusa . Just to verify. 1) Use the modified "compute_overlaps_masks" function in your branch, and in the load_mask function, set masks for background only images to np.empty([0, 0, 0]) and maskIds to np.empty([0], np.int32) . And this will enable training with background only images?

keineahnung2345 commented 5 years ago

You can try out this PR: https://github.com/matterport/Mask_RCNN/pull/1088.

sohinimallick commented 3 years ago

@smudge1872 what code edit did you make to load_mask()? I just edited the compute_overlaps_masks function and it returns an mAP of 0.

sohinimallick commented 3 years ago

Anyone know the answer to this?

smudge1872 commented 3 years ago

@sohinimallick I used the balloon.py example as a template to create a config class for the object I was trying to detect, and edited the load_mask function in that example. For images that had the objects, I used pixel-level ground-truth masks. For images that did not have the objects, I did this:

    mask = np.empty([0, 0, 0])
    maskIDs = np.empty([0], np.int32)
    return mask, maskIDs
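
The edit described above can be sketched as a standalone function. `load_mask_sketch` is a hypothetical helper, not part of balloon.py or the mrcnn package; it takes a list of per-object masks instead of an image ID so that it runs on its own. Whether the rest of the training pipeline accepts the empty arrays is not verified here.

```python
import numpy as np


def load_mask_sketch(instance_masks):
    """Hypothetical helper mirroring the load_mask() edit described above.

    instance_masks: list of [H, W] boolean arrays, one per object (may be empty).
    Returns (mask, class_ids) in the [H, W, instances] layout.
    """
    if len(instance_masks) == 0:
        # Background-only image: no instances and no class IDs.
        return np.empty([0, 0, 0]), np.empty([0], np.int32)
    mask = np.stack(instance_masks, axis=-1).astype(bool)
    # Single object class, so every instance gets class ID 1.
    return mask, np.ones([mask.shape[-1]], dtype=np.int32)


bg_mask, bg_ids = load_mask_sketch([])
obj_mask, obj_ids = load_mask_sketch([np.ones((4, 4), dtype=bool)] * 2)
print(bg_mask.shape, bg_ids.shape)    # (0, 0, 0) (0,)
print(obj_mask.shape, obj_ids.shape)  # (4, 4, 2) (2,)
```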