mrgloom opened this issue 6 years ago
I think the problem in the 512x512 case can also be the learning rate. Try decreasing it.
Yes, it helped. I switched to Adam with lr=0.00001 and it converges:
model.compile(optimizer=Adam(lr=0.00001), loss='binary_crossentropy')
Can you elaborate on why this helps?
It's just from my experience. You can read more about it here: https://www.kdnuggets.com/2017/11/estimating-optimal-learning-rate-deep-neural-network.html
Deep learning rule of thumb: if a network should converge but doesn't, try reducing the learning rate. )
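Not from this thread, but the rule of thumb can be illustrated on a toy problem. For gradient descent on f(x) = x^2, each step multiplies x by (1 - 2*lr), so a learning rate that is too large makes the iterates grow instead of shrink (the lr values below are illustrative, not the ones used in the issue):

```python
def descend(lr, steps=50, x0=1.0):
    """Plain gradient descent on f(x) = x**2, whose gradient is 2*x."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # gradient step: x <- x * (1 - 2*lr)
    return x

small = descend(lr=0.1)  # |1 - 0.2| = 0.8 < 1, so x shrinks toward 0
large = descend(lr=1.1)  # |1 - 2.2| = 1.2 > 1, so x blows up
print(small, large)
```

The same mechanism, in a much noisier form, is why a network that "must converge but doesn't" often starts converging once the learning rate is lowered.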
To simplify things, let's consider a VGG16-based FCN-32s (not U-Net):
Code:
Architecture(IMAGE_H = 32, IMAGE_W = 32, INPUT_CHANNELS = 3, NUMBER_OF_CLASSES = 1):
VGG16 was originally developed for a 224x224 image size. I tried to train the network on synthetic data with different input sizes, i.e. IMAGE_H, IMAGE_W ranging from 32x32 to 512x512. The object is an ellipse that always fits inside the image.
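A quick sanity check on those sizes (this helper is mine, not part of the original code): the VGG16 encoder contains five 2x2 max-pooling layers, so the spatial size of the bottleneck feature map is the input size divided by 32:

```python
def bottleneck_size(image_size, n_pools=5):
    """Spatial size of the encoder output after n_pools 2x2 max-pool layers."""
    size = image_size
    for _ in range(n_pools):
        size //= 2  # each pooling layer halves the spatial resolution
    return size

for s in (32, 64, 128, 256, 512):
    print(s, '->', bottleneck_size(s))  # 32 -> 1, ..., 512 -> 16
```

So at 32x32 input the bottleneck really is a single 1x1 spatial position, while at 512x512 it is 16x16.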
Sample generation code:
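The original generation code isn't reproduced here, but a minimal NumPy sketch of such a generator (a random ellipse mask whose parameters are chosen so the ellipse always fits inside the image; the parameter ranges are my assumptions) might look like:

```python
import numpy as np

def random_ellipse_mask(h, w, rng):
    """Binary mask of a random ellipse guaranteed to fit inside an h x w image."""
    # Center near the middle, axes bounded so the ellipse stays inside the frame.
    cy = h / 2 + rng.uniform(-h / 8, h / 8)
    cx = w / 2 + rng.uniform(-w / 8, w / 8)
    ry = rng.uniform(h / 8, h / 4)
    rx = rng.uniform(w / 8, w / 4)
    yy, xx = np.mgrid[0:h, 0:w]
    # Implicit ellipse equation: points inside satisfy the inequality.
    mask = ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0
    return mask.astype(np.float32)

mask = random_ellipse_mask(64, 64, np.random.default_rng(0))
```

The mask itself (or a noisy rendering of it) serves as both input and ground truth for the segmentation experiment.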
Obtained results (1st image is the input, 2nd the ground-truth mask, 3rd the predicted probability, 4th the probability thresholded at 0.5):
32x32: It works, and it's a little surprising to me that the network can reconstruct the ellipse shape from a blob with 1x1 spatial size.
64x64: For some reason, at this input size I had to train the network for longer; otherwise it converges to something like this:
128x128: Works well.
256x256: Works well.
512x512: Fails; it looks like the network learned only to predict all zeros.
So my question is: is this related to receptive field size? Or is it related to a class imbalance problem (lots of background pixels)? And how can one deal with it?
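On the class-imbalance hypothesis: a common remedy (not something the thread itself prescribes) is to replace or supplement binary cross-entropy with a soft Dice loss, which measures overlap and therefore ignores the huge number of correctly-predicted background pixels. A NumPy sketch of the idea:

```python
import numpy as np

def soft_dice_loss(y_true, y_pred, eps=1e-6):
    """1 - Dice coefficient; counts only foreground overlap, not true negatives,
    so an all-zeros prediction gets close to the maximum loss of 1."""
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred)
    return 1.0 - (2.0 * intersection + eps) / (union + eps)

y_true = np.array([0.0, 1.0, 1.0, 0.0])
print(soft_dice_loss(y_true, y_true))           # perfect overlap -> ~0
print(soft_dice_loss(y_true, np.zeros(4)))      # all-zeros prediction -> ~1
```

With plain binary cross-entropy, predicting all zeros on a 512x512 image that is mostly background already yields a low average loss, which matches the observed failure mode; an overlap-based loss removes that local minimum.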