qubvel / segmentation_models

Segmentation models with pretrained backbones. Keras and TensorFlow Keras.
MIT License

Dealing with false-positives for binary segmentation #269

Open vnks opened 4 years ago

vnks commented 4 years ago

Hi,

I'm using the fantastic example for binary segmentation with my own set of images, and it works great for true positives (images that do contain the objects to be segmented). However, I'm also getting some false positives, where a mask is predicted for images that do not contain the object to be segmented. I can't seem to train the model to avoid these false positives; it's as if images in the training set with an empty mask are ignored. Is there a way to provide negative feedback to the model? I tried creating a training set that only contains empty masks, expecting it to learn that all images have nothing in them, but the model just ends up predicting random noise. I'm not sure if these kinds of images get filtered out somewhere, or if negative feedback is just not possible with binary segmentation models.

Thank you!

JordanMakesMaps commented 4 years ago

Do you have some example images you could show for reference?

It might be worth looking into using weighted classes within some of the loss functions. Typically the weight for each class is based on how often it occurs in the data set, so that minority classes are "equally" accounted for. Also, some loss functions handle class imbalance better than others (I forget which ones), so that's worth investigating as well.
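For illustration, here's a rough sketch of a class-weighted binary cross-entropy in Keras (not this library's API; the pos_weight value is made up, and in practice you'd derive it from the class frequencies in your training masks):

```python
import tensorflow.keras.backend as K

def weighted_bce(pos_weight=5.0):
    """Binary cross-entropy that up-weights the positive (foreground) class."""
    def loss(y_true, y_pred):
        # Clip to avoid log(0)
        y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
        bce = -(pos_weight * y_true * K.log(y_pred)
                + (1.0 - y_true) * K.log(1.0 - y_pred))
        return K.mean(bce)
    return loss

# `model` stands for your existing U-Net from the binary segmentation example:
# model.compile(optimizer="adam", loss=weighted_bce(pos_weight=5.0))
```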

vnks commented 4 years ago

Hey,

Thanks for getting back to me so quickly! My model is identifying water bodies in satellite imagery, using 256x256 tiles. It's a binary segmentation model, so it's using a single class, with images and masks like these:

[example image tile and its water mask: 0206-10-9]

It's working great on these kinds of tiles, with a mix of water and non-water pixels. However, it's unable to identify tiles that either have only water, or no water at all:

[example tiles: 0206-9-5 (all water), 1 (no water)]

The first image has a mask with every pixel set, and the second one has an empty mask. After poking around some more, I suspect that these masks somehow get normalized to nothing or ignored: plt.imshow shows an empty mask for either one unless I toggle a pixel or two. Since augmentations sometimes produce empty or all-set masks, and training does seem to work with those, I'm thinking it may have something to do with pre-processing, but I haven't been able to narrow it down yet.
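One thing worth ruling out first: with default autoscaling, matplotlib maps a constant array to a single color, so an all-set mask and an empty mask render identically as blank images. Pinning the display range and printing the raw values avoids being misled (here mask stands for one of my 256x256 masks):

```python
import matplotlib.pyplot as plt

# Pin the display range so a constant mask isn't autoscaled to a blank image
plt.imshow(mask, cmap="gray", vmin=0, vmax=1)
plt.show()

# Check the raw values rather than trusting the rendering
print(mask.dtype, mask.min(), mask.max())
```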

Is the UNet binary segmentation model capable of learning from these kinds of images in the training set? Perhaps I need a different approach instead of trying to track down where the normalization is happening.

JordanMakesMaps commented 4 years ago

> Is the UNet binary segmentation model capable of learning from these kinds of images in the training set?

Yes, it definitely should work.

> I tried creating a training set that only contains empty masks, expecting it to learn that all images have nothing in them, but the model just ends up predicting random noise.

I would have expected it to produce just a blank mask too. But that's why I think the augmentation/pre-processing might have changed the masks' values in a way you didn't expect.

Try looking at the dtype of the masks that are fed into the model during training: are they uint8, uint32, float16, float32, etc., both when the image contains only water and when it contains none? What are they before and after pre-processing and augmentation? And finally, are the same methods being applied to the test data that were used on the training data?
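Something like this quick check at each stage of the pipeline would narrow it down (raw_mask and aug_mask are placeholders for a mask taken before and after your pre-processing/augmentation):

```python
import numpy as np

def describe(name, arr):
    # A silent dtype change or rescaling shows up immediately in these stats
    arr = np.asarray(arr)
    print(f"{name}: dtype={arr.dtype}, min={arr.min()}, max={arr.max()}, "
          f"unique values={np.unique(arr)[:10]}")

describe("raw mask", raw_mask)
describe("augmented mask", aug_mask)
```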

vnks commented 4 years ago

So it looks like I had somehow lowered the learning rate; after fixing that, the model does seem to be taking negative feedback now. I think my remaining issues are due to the dataset, so I'll keep experimenting. Thanks for your help!

vnks commented 4 years ago

Hi,

So I think I figured out what's going on here: the functions calculating IoU, and consequently the loss functions based on them, don't handle empty masks correctly. If the ground truth mask is empty, intersecting it with anything produces an empty result, so no matter what mask is predicted, the network gets no feedback. You can verify this by calling the Dice or Jaccard loss functions with an empty ground truth mask: whatever the predicted mask, the output is the same.
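Here's a minimal numpy reconstruction of the IoU-style loss (my own sketch with an assumed small smooth constant, not this library's exact code) that shows the saturation:

```python
import numpy as np

def jaccard_loss(gt, pr, smooth=1e-5):
    # IoU-style loss: 1 - (|intersection| + smooth) / (|union| + smooth)
    intersection = np.sum(gt * pr)
    union = np.sum(gt) + np.sum(pr) - intersection
    return 1.0 - (intersection + smooth) / (union + smooth)

empty_gt = np.zeros((256, 256))          # tile with no water at all

one_pixel = np.zeros((256, 256))
one_pixel[0, 0] = 1.0                    # prediction with a single stray pixel

half_tile = np.zeros((256, 256))
half_tile[:128, :] = 1.0                 # prediction covering half the tile

print(jaccard_loss(empty_gt, np.zeros((256, 256))))  # 0.0  -- perfect
print(jaccard_loss(empty_gt, one_pixel))             # ~1.0
print(jaccard_loss(empty_gt, half_tile))             # ~1.0 -- same as one pixel
```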

I'm not sure how to special-case this correctly; maybe it should either invert both masks and compute the IoU on the inverted pair in this case, or simply treat the fraction of predicted pixels as the loss. I have no experience with Keras, so figuring out how to express and test this properly is taking forever. I'd appreciate any pointers.

Thanks!

vnks commented 4 years ago

The jaccard_distance loss function from keras_contrib ended up working for me. I'm not sure what the difference is; the logic looks the same at a cursory glance, but the keras_contrib version produces sensible values given empty ground truth masks, while the version in this project returns the same value no matter what the predicted mask is.
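For reference, a usage sketch (assuming keras_contrib is installed; model stands for the compiled U-Net from the example):

```python
from keras_contrib.losses import jaccard_distance

# keras_contrib's jaccard_distance defaults to smooth=100, which keeps a
# usable gradient even when the ground truth mask is empty
model.compile(optimizer="adam", loss=jaccard_distance)
```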

qubvel commented 4 years ago

If you pass an empty mask as the target, there are two possible situations: 1) if you predict at least one pixel with value "1", your loss will be 1; 2) if you predict absolutely all pixels as zero, your loss will be 0.

qubvel commented 4 years ago

You can increase the smooth value to make the loss function smoother :) keras_contrib suggests smooth=100

vnks commented 4 years ago

Yeah, I noticed that the smoothing factor in keras_contrib is a few orders of magnitude higher. Maybe increase the default here as well? As it stands, empty masks effectively contribute nothing by default, since there's no difference between predicting one set pixel or 99% of them. The sketch below illustrates the difference.
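Using the same numpy sketch from my earlier comment, here is the loss for an empty ground truth against a one-pixel prediction versus a half-tile prediction, at both smoothing values:

```python
import numpy as np

def jaccard_loss(gt, pr, smooth):
    # Same IoU-style sketch as above, with smooth passed explicitly
    intersection = np.sum(gt * pr)
    union = np.sum(gt) + np.sum(pr) - intersection
    return 1.0 - (intersection + smooth) / (union + smooth)

empty_gt = np.zeros((256, 256))
one_pixel = np.zeros((256, 256)); one_pixel[0, 0] = 1.0
half_tile = np.zeros((256, 256)); half_tile[:128, :] = 1.0

for smooth in (1e-5, 100):
    print(smooth,
          round(jaccard_loss(empty_gt, one_pixel, smooth), 4),
          round(jaccard_loss(empty_gt, half_tile, smooth), 4))
# smooth=1e-05: 1.0    1.0    -- one stray pixel costs as much as half the tile
# smooth=100:   0.0099 0.997  -- the penalty now scales with the mistake
```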