rbavery opened this issue 6 years ago
+1 on this. I'm working on a dataset where some sample images contain no objects at all; they're negative examples. Your thread makes me wonder whether this package has poor support for background-only images.
Edited: I don't hit training errors when training on a mixture of images with objects and pure background, but my training results were poor as well. I'm wondering whether one should instead rule out all the pure-background images to make the package work.
Yeah @mekomlusa, I was wondering if discarding is the best way to go. If I tried that I'd lose about 100 of my ~500 samples, but I'm interested to hear whether it works out for you if you try it.
Since you're not encountering NaNs in training, I think there might be another reason my rpn_bbox_loss decreases but eventually turns to NaN. I used the inspect-model notebook to visualize what was going on, and it looks like right before rpn_bbox_loss goes to NaN, many proposed regions are way off from the actively farmed fields in the top corners. But the inspection doesn't give me much of an idea why rpn_bbox_loss goes to NaN. Any suggestions are much appreciated!
The next thing I'll try is training on 3-channel RGB rather than 8 channels so that I can pre-train on ImageNet instead of training from scratch. I'll also use larger inputs than ~256x256: I'm currently gridding up 8 ~2500x2500 images, and I can adjust the inputs to 512x512 since I'm noticing a lot of fields are only partially present.
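For anyone curious, here is a minimal sketch of that gridding step; `grid_image`, the tile size, and the zero-padding choice are my own assumptions for illustration, not code from this repo:

```python
import numpy as np

def grid_image(img, tile=512):
    """Split an (H, W, C) array into tile x tile chips, zero-padding
    the right and bottom edges. Hypothetical helper, not repo code."""
    H, W, _ = img.shape
    pad_h, pad_w = (-H) % tile, (-W) % tile
    img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    return [img[r:r + tile, c:c + tile]
            for r in range(0, img.shape[0], tile)
            for c in range(0, img.shape[1], tile)]
```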
Here are my results for one field at the 20th epoch. Training fails at the 21st epoch.
Configurations:
```
BACKBONE resnet50
BACKBONE_STRIDES [4, 8, 16, 32, 64]
BATCH_SIZE 1
BBOX_STD_DEV [0.1 0.1 0.2 0.2]
CHANNELS_NUM 8
COMPUTE_BACKBONE_SHAPE None
DETECTION_MAX_INSTANCES 100
DETECTION_MIN_CONFIDENCE 0.7
DETECTION_NMS_THRESHOLD 0.3
FPN_CLASSIF_FC_LAYERS_SIZE 1024
GPU_COUNT 1
GRADIENT_CLIP_NORM 5.0
IMAGES_PER_GPU 1
IMAGE_MAX_DIM 256
IMAGE_META_SIZE 14
IMAGE_MIN_DIM 256
IMAGE_MIN_SCALE 0
IMAGE_RESIZE_MODE pad64
IMAGE_SHAPE [256 256 8]
LEARNING_MOMENTUM 0.9
LEARNING_RATE 0.001
LOSS_WEIGHTS {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE 14
MASK_SHAPE [28, 28]
MAX_GT_INSTANCES 28
MEAN_PIXEL [259.6 347. 259.8 416.3 228.23 313.4 187.5 562.9 ]
MINI_MASK_SHAPE (56, 56)
NAME wv2-gridded-no-partial
NUM_CLASSES 2
POOL_SIZE 7
POST_NMS_ROIS_INFERENCE 1000
POST_NMS_ROIS_TRAINING 2000
ROI_POSITIVE_RATIO 0.33
RPN_ANCHOR_RATIOS [0.5, 1, 2]
RPN_ANCHOR_SCALES (16, 32, 64, 128, 180)
RPN_ANCHOR_STRIDE 1
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD 0.7
RPN_TRAIN_ANCHORS_PER_IMAGE 256
STEPS_PER_EPOCH 1000
TOP_DOWN_PYRAMID_SIZE 256
TRAIN_BN False
TRAIN_ROIS_PER_IMAGE 100
USE_MINI_MASK False
USE_RPN_ROIS True
VALIDATION_STEPS 100
WEIGHT_DECAY 0.0001
```
@rbavery Thanks for sharing your config. What I'm afraid of is that in reality, the images I feed to my trained model will be either pure background or contain the objects of interest. If I train the model extensively without any "noise" data, I suspect it may fail when given a pure-background image with no objects. But I will try (if time permits :P), though that means I'll lose half of my data (~2,000 pics in total).
As for your rpn_bbox_loss going to NaN - did you have this on other loss metrics as well? I did one time: after epoch 5 all my losses suddenly became NaN (both training and validation). That was because my learning rate was set too high. The default is 0.001 and it looks like you didn't change yours either, but in my case I had to set it lower or the whole model diverged.
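For illustration, a minimal sketch of lowering the learning rate via a Config subclass; the class name and the 1e-4 value are assumptions for the example, not settings from this thread:

```python
from mrcnn.config import Config

class LowLRConfig(Config):
    """Hypothetical config that only lowers the learning rate."""
    NAME = "low_lr"
    # Default is 0.001; try dropping it if losses suddenly go to NaN
    LEARNING_RATE = 1e-4
```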
BTW, training extensively using only images with objects makes the situation worse. My model now always "sees" stuff where there's nothing there, even more so than before.
Thanks for the update and help @mekomlusa. I think the mrcnn model.py can handle negative samples just fine, then. It was only rpn_bbox_loss that went to NaN after decreasing for 20 epochs. Once rpn_bbox_loss went to NaN, all other losses went up in magnitude but did not go to NaN (except for the overall loss). I'll try a lower learning rate if I end up getting the same error with a simpler case: instead of parsing fields/no fields, I'm going to try circular fields vs. everything else.
Hi,
I also have the same question regarding images with background only.
I am using another dataset (car damage) with the balloon sample and it worked quite well. However, when I tried to splash (infer) on a new image, it also segmented objects other than the damage (a motorbike, for example; there are no motorbikes at all in the annotated images).
I saw in balloon.py that training images without regions (annotations) are automatically excluded. I am wondering if it is possible/a good idea to add images without annotations so that the model can learn that objects such as motorbikes are not objects to segment.
Have any of you tried adding the BG-only images? @rbavery @mekomlusa
Thx!
@eljirg Yes, previously my training examples were a mixture of background-only images + true samples. It worked better than the other set, which consisted solely of true samples. But to make it work (esp. if you have only 2 classes, i.e. a binary problem) you need to tweak the code a little (mainly the visualization/IoU part). I didn't use the balloon example so I can't comment more on that. Good luck!
@mekomlusa, can you explain how to tweak the code? I see compute_overlaps_masks in utils.py.
I need to compute mIoU, but this function throws errors; my mask dimensions are (H, W, 0) at times.
```python
if masks1.shape[0] == 0 or masks2.shape[0] == 0:
    return np.zeros((masks1.shape[0], masks2.shape[-1]))
masks1 = np.reshape(masks1 > .5, (-1, masks1.shape[-1])).astype(np.float32)
masks2 = np.reshape(masks2 > .5, (-1, masks2.shape[-1])).astype(np.float32)
area1 = np.sum(masks1, axis=0)
area2 = np.sum(masks2, axis=0)
```
@SreenivasVRao Hey sorry for replying late. Below is what I've changed on my end in order to make it work:
```python
# If either set of masks is empty return empty result
if masks1.shape[0] == 0 or masks2.shape[0] == 0:
    return np.zeros((masks1.shape[0], masks2.shape[-1]))
# Added the two checks below to ensure that an image with no mask
# detected is returned correctly; idea taken from
# https://github.com/matterport/Mask_RCNN/issues/532
if np.sum((masks2 > .5).astype(np.uint8)) == 0:
    return np.zeros((masks1.shape[0], masks2.shape[-1]))
if np.sum((masks1 > .5).astype(np.uint8)) == 0:
    return np.zeros((masks1.shape[0], masks2.shape[-1]))
# Flatten masks and compute their areas
masks1 = np.reshape(masks1 > .5, (-1, masks1.shape[-1])).astype(np.float32)
masks2 = np.reshape(masks2 > .5, (-1, masks2.shape[-1])).astype(np.float32)
```
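For context, the remainder of compute_overlaps_masks computes the IoU matrix roughly as follows (paraphrased from mrcnn/utils.py; verify against your copy). Without the early returns above, all-zero masks make both intersection and union zero, and the 0/0 division produces NaNs:

```python
area1 = np.sum(masks1, axis=0)
area2 = np.sum(masks2, axis=0)
# Intersections and union; overlaps is the IoU matrix
intersections = np.dot(masks1.T, masks2)
union = area1[:, None] + area2[None, :] - intersections
overlaps = intersections / union
return overlaps
```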
@mekomlusa Can you kindly suggest how to load and label negative images with their empty masks?
@mekomlusa @SreenivasVRao @rbavery @eljirg I would be grateful if you could mention what changes you made in model.py to train on negative images (i.e. images which don't have any mask). I have tried the below:
```python
if config.TRAIN_ON_BG_ONLY_IMAGES:
    negative_count = tf.cond(tf.greater(positive_count, 0),
                             lambda: tf.cast(r * tf.cast(positive_count, tf.float32), tf.int32) - positive_count,
                             lambda: config.TRAIN_ROIS_PER_IMAGE)
else:
    negative_count = tf.cast(r * tf.cast(positive_count, tf.float32), tf.int32) - positive_count
```
But after making the changes, when I pass only background images, I am not seeing any loss in the console (which means the model is not getting trained). I added config.TRAIN_ON_BG_ONLY_IMAGES as an extra config parameter. The code above is from def detection_targets_graph(proposals, gt_class_ids, gt_boxes, gt_masks, config):.
1) Do I have to make changes somewhere else as well to make sure the model is trained on background-only images? 2) If the model is being trained on background-only images, how can I verify that? One way is to use pretrained weights and watch the loss while feeding only background-only images for training, but in that case I don't see any loss in the console while the model is training.
Any help is highly appreciated. Thanks a lot.
Sorry I don't have access to the codebase anymore. But I think this will help: https://github.com/matterport/Mask_RCNN/issues/532#issuecomment-400091605
I discarded NaN values.
@SreenivasVRao Thanks a lot for your prompt reply. I am trying the suggestion in #532
Yep, @SreenivasVRao is right. I cannot access my original codebase anymore, but I believe my comment for that issue should work.
```python
image = dataset.load_image(image_id)
mask, class_ids = dataset.load_mask(image_id)
```
From the load_mask function (nucleus example):
```python
# Since we have one class ID, we return an array of ones
return mask, np.ones([mask.shape[-1]], dtype=np.int32)
```
```python
# Active classes
# Different datasets have different classes, so track the
# classes supported in the dataset of this image.
active_class_ids = np.zeros([dataset.num_classes], dtype=np.int32)
source_class_ids = dataset.source_class_ids[dataset.image_info[image_id]["source"]]
active_class_ids[source_class_ids] = 1
```
I'm a bit confused; I wonder how you all deal with this problem. It would be great to hear some advice from you. Thanks ;)
@pallaviroyal Did you find out how to load the masks of background images only?
@smudge1872 Yes, I found out. I was doing it for the tgs-salt-identification-challenge, and I worked out how to prepare the dataset object for Mask_RCNN. I checked my backup but, apologies, I've lost my code. Note that the Mask_RCNN implementation isn't really meant for background-only images either, because it has an internal ROI (Region of Interest) layer.
@pallaviroyal @mekomlusa So it is not possible to add background (negative) images to the training set?
@smudge1872 I won't say it's not possible; I got the best result when using half pure-background + half normal images (with the objects of interest). Code is available earlier in the thread.
Thanks @mekomlusa. Just to verify: 1) use the modified compute_overlaps_masks function in your branch, and 2) in the load_mask function, set masks for background-only images to np.empty([0, 0, 0]) and mask IDs to np.empty([0], np.int32). Will this enable training with background-only images?
You can try out this PR: https://github.com/matterport/Mask_RCNN/pull/1088.
> Thanks @mekomlusa. Just to verify: 1) use the modified compute_overlaps_masks function in your branch, and 2) in the load_mask function, set masks for background-only images to np.empty([0, 0, 0]) and mask IDs to np.empty([0], np.int32). Will this enable training with background-only images?
@smudge1872 What code edit did you make to load_mask()? I just edited the compute_overlaps_masks fn and it returns a mAP of 0.
> Thanks @mekomlusa. Just to verify: 1) use the modified compute_overlaps_masks function in your branch, and 2) in the load_mask function, set masks for background-only images to np.empty([0, 0, 0]) and mask IDs to np.empty([0], np.int32). Will this enable training with background-only images?
>
> @smudge1872 What code edit did you make to load_mask()? I just edited the compute_overlaps_masks fn and it returns a mAP of 0.
Anyone know the answer to this?
@sohinimallick I used the balloon.py example as a template to create a config class for the object I was trying to detect, and I edited the load_mask function in the balloon.py example. I used pixel ground-truth masks for the images that had the objects. For images that did not have the objects, I did this:
```python
mask = np.empty([0, 0, 0])
maskIDs = np.empty([0], np.int32)
return mask, maskIDs
```
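To show where that return fits, here is a minimal sketch of a balloon.py-style load_mask override; the "polygons" image_info layout and the single-class assumption come from the balloon example, and the rest is my own illustration:

```python
import numpy as np
import skimage.draw

def load_mask(self, image_id):
    """Generate instance masks for an image; background-only
    images return empty arrays so training can proceed."""
    info = self.image_info[image_id]
    if not info["polygons"]:
        # Background-only image: no instances, no class IDs
        return np.empty([0, 0, 0]), np.empty([0], np.int32)
    # One mask channel per annotated polygon
    mask = np.zeros([info["height"], info["width"], len(info["polygons"])],
                    dtype=np.uint8)
    for i, p in enumerate(info["polygons"]):
        rr, cc = skimage.draw.polygon(p["all_points_y"], p["all_points_x"])
        mask[rr, cc, i] = 1
    # Single object class, so every instance gets class ID 1
    return mask.astype(bool), np.ones([mask.shape[-1]], dtype=np.int32)
```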
Hello, I'm trying to apply mrcnn to satellite imagery. I'm inspecting the data for errors since I was getting poor training results (I've already made adjustments for the fact that my inputs have 8 channels instead of 3). When I use the inspect-data notebook on my dataset, I see that a lot of images have no objects (over 100 out of 469). But in the nucleus example, all training images have at least one object. I think an issue I'm having is that I am not correctly encoding the mask for negative samples with no objects.
Here is my data inspection notebook with example images: https://github.com/rbavery/CropMask_RCNN/blob/master/notebooks/crops/inspect_crop_data.ipynb
For each image that has no field, I have an empty mask of shape (262, 262, 1) that is all zeros - the same shape as the image except for the channels. This is different from a mask with one field, which would have 1s for field pixels and 0s elsewhere.
mrcnn.visualize.display_instances() works on images with at least 1 field but fails for images with no fields. The reason is that load_image_gt() in mrcnn/model.py changes the shape of the masks with no objects from (262, 262, 1) to (262, 262, 0) in its instance-filtering step. This then makes mask an empty array, with no zeros to indicate the background class, causing display to fail and possibly the model training to fail (I'm still not sure if this is what causes the training issues, but I think it's likely).
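For reference, the filtering step in load_image_gt() that produces this behavior looks roughly like the following; this is paraphrased from mrcnn/model.py from memory, so verify against your copy:

```python
# Drop instances whose masks are all zeros (e.g. clipped out by
# cropping); a background-only image ends up with zero instances
_idx = np.sum(mask, axis=(0, 1)) > 0
mask = mask[:, :, _idx]
class_ids = class_ids[_idx]
```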
Any suggestions on how to encode sample masks with only the background class, no objects?