ljanyst / image-segmentation-fcn

Semantic Image Segmentation using a Fully Convolutional Neural Network in TensorFlow
http://jany.st/post/2017-06-25-semantic-image-segmentation-using-fcns.html

regarding the number of classes and data augmentation mechanism #2

Closed. wenouyang closed this issue 6 years ago

wenouyang commented 6 years ago

Hi Lukasz, thank you very much for sharing the solution. I have two questions:

1) In some ground truth images, I found that some pixels are marked as black. What do those pixels represent? My understanding is that this should be a binary semantic segmentation problem, i.e., we only have two classes, background and road. Moreover, how do you handle those black pixels?

2) What kind of data augmentation mechanisms have you been using?

Thanks a lot for your response.

ljanyst commented 6 years ago

It is, in fact, a binary problem. I think some of these labels are just broken. If you're not careful, though, they may cause numerical instabilities in the model. See this note from the TensorFlow docs:

NOTE: While the classes are mutually exclusive, their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If they are not, the computation of the gradient will be incorrect.

I even wrote a script to check whether you always end up with a proper class probability distribution in your labels. Just treat everything that is not purple as background.
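For illustration, here is a minimal NumPy sketch of that idea. It is not the script from the repository. The function names are hypothetical, and it assumes that "purple" means the exact RGB value (255, 0, 255) and that ground-truth images are loaded as (H, W, 3) uint8 arrays.

```python
import numpy as np

# Assumption: road pixels are drawn in magenta, RGB (255, 0, 255).
ROAD_COLOR = np.array([255, 0, 255], dtype=np.uint8)


def labels_from_ground_truth(gt_rgb):
    """Convert an (H, W, 3) uint8 ground-truth image to (H, W, 2) one-hot labels.

    Channel 0 is background, channel 1 is road. Any pixel that is not
    exactly ROAD_COLOR (black, red, ...) is treated as background.
    """
    road = np.all(gt_rgb == ROAD_COLOR, axis=-1)
    return np.stack([~road, road], axis=-1).astype(np.float32)


def is_valid_distribution(labels):
    """Check that every pixel's label vector is non-negative and sums to 1."""
    return bool(np.all(labels >= 0) and np.allclose(labels.sum(axis=-1), 1.0))


# Tiny 1x3 example: one road pixel, one red pixel, one black pixel.
gt = np.array([[[255, 0, 255], [255, 0, 0], [0, 0, 0]]], dtype=np.uint8)
labels = labels_from_ground_truth(gt)
assert is_valid_distribution(labels)
```

Because the background channel is defined as the complement of the road mask, black pixels can never produce an all-zero label row, which is the case the TensorFlow note warns about.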

I did not do any augmentation. The original paper on page 7 says:

Augmentation We tried augmenting the training data by randomly mirroring and “jittering” the images by translating them up to 32 pixels (the coarsest scale of prediction) in each direction. This yielded no noticeable improvement.
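For anyone who wants to try it anyway, the mirroring and jittering described in that passage could be sketched roughly as below. This is an illustration only, not code from the repository or the paper. The function names are hypothetical, and filling the uncovered border with zeros in the image and with the background class (assumed to be channel 0) in the labels is my own assumption.

```python
import numpy as np


def _shift_slices(n, d):
    """Return (src, dst) slices that move content by d along an axis of length n."""
    d = int(np.clip(d, -n, n))
    if d >= 0:
        return slice(0, n - d), slice(d, n)
    return slice(-d, n), slice(0, n + d)


def augment(image, labels, rng, max_shift=32):
    """Randomly mirror and translate an image and its labels together.

    image:  (H, W, C) array; labels: (H, W, K) one-hot array.
    Border pixels uncovered by the shift get zero image values and the
    background class (assumed to be channel 0 of labels).
    """
    # Mirror left-right with probability 0.5, applied to both arrays.
    if rng.random() < 0.5:
        image, labels = image[:, ::-1], labels[:, ::-1]

    # Draw one vertical and one horizontal shift in [-max_shift, max_shift].
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    h, w = image.shape[:2]
    ys, yd = _shift_slices(h, dy)
    xs, xd = _shift_slices(w, dx)

    out_img = np.zeros_like(image)
    out_lbl = np.zeros_like(labels)
    out_lbl[..., 0] = 1  # default every pixel to background
    out_img[yd, xd] = image[ys, xs]
    out_lbl[yd, xd] = labels[ys, xs]
    return out_img, out_lbl


rng = np.random.default_rng(0)
img = rng.random((64, 96, 3)).astype(np.float32)
lbl = np.zeros((64, 96, 2), dtype=np.float32)
lbl[..., 0] = 1
lbl[20:40, 30:60] = [0, 1]  # a rectangular "road" patch
aug_img, aug_lbl = augment(img, lbl, rng)
```

The same flip and shift are applied to the image and the labels so they stay aligned, and the label rows remain valid probability distributions after the transform.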