alexgkendall / caffe-segnet

Implementation of SegNet: A Deep Convolutional Encoder-Decoder Architecture for Semantic Pixel-Wise Labelling
http://mi.eng.cam.ac.uk/projects/segnet/

strange results with four-channel images #24

Closed jorgyz closed 8 years ago

jorgyz commented 8 years ago

First, thanks for sharing your code. I'm running into a strange issue that I'm hoping Alex or someone else might be able to offer some insight on.

I'm training a model on a dataset with 3 classes and 4 channels (RGB + near-infrared). For some reason, at inference time, the model labels pixels only as class 1 or class 2 and never as class 3. If I remove the near-infrared channel from the data and retrain, the model behaves as expected and labels pixels of all three classes. I'm training the model for at least 100 epochs, and I've tried it both with and without median class balancing. The strange thing is that if I train on the four-channel data for a small number of epochs (about 10), it does label pixels of all three classes (although the predictions are still fairly noisy); however, a model trained for 20 or more epochs no longer classifies anything as class 3.

I have triple-checked the prototxt files and my script for encoding the four-channel images into lmdb, but I don't see any obvious issues. Any insight about what might be causing this and how to fix it? Thanks!
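For readers unfamiliar with the median class balancing mentioned above, here is a minimal sketch of how such weights are commonly computed. It assumes the usual definition: a class's frequency is its pixel count divided by the total pixel count of the images in which it appears, and its weight is the median frequency divided by its own frequency. The function name `median_frequency_weights` and its arguments are hypothetical and do not come from the caffe-segnet code.

```python
import numpy as np


def median_frequency_weights(label_maps, num_classes):
    """Return one loss weight per class (hypothetical helper).

    label_maps: iterable of 2-D integer arrays with values in [0, num_classes).
    """
    pixel_counts = np.zeros(num_classes)  # pixels of class c over the dataset
    image_pixels = np.zeros(num_classes)  # pixels of images that contain class c

    for labels in label_maps:
        # Labels >= num_classes (e.g. an ignore label) are dropped from the counts.
        counts = np.bincount(labels.ravel(), minlength=num_classes)[:num_classes]
        pixel_counts += counts
        image_pixels += (counts > 0) * labels.size

    freq = pixel_counts / np.maximum(image_pixels, 1)
    present = freq > 0
    weights = np.zeros(num_classes)
    # Rare classes get weights above 1, frequent classes below 1.
    weights[present] = np.median(freq[present]) / freq[present]
    return weights
```

Classes that never occur in the label maps get a weight of 0 in this sketch; whether that is the right choice depends on the dataset.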

alexgkendall commented 8 years ago

Hey - what is the range of your RGB input and near-infrared input? Do you do any normalization to balance them (i.e. scale the range of all four channels to -1 < x < 1)?
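A minimal sketch of that kind of scaling, assuming the image is a NumPy array of shape height x width x channels with a known value range per channel. The function name `scale_channels` and the default 0 to 255 range are illustrative assumptions, not part of caffe-segnet.

```python
import numpy as np


def scale_channels(image, lo=0.0, hi=255.0):
    """Linearly map values from [lo, hi] to [-1, 1] (hypothetical helper).

    lo and hi may be scalars, or arrays with one entry per channel so that
    channels with different ranges end up on the same scale.
    """
    image = image.astype(np.float32)
    lo = np.asarray(lo, dtype=np.float32)
    hi = np.asarray(hi, dtype=np.float32)
    return 2.0 * (image - lo) / (hi - lo) - 1.0
```

If the near-infrared channel had a different range than RGB, one could pass per-channel bounds, for example `scale_channels(img, lo=[0, 0, 0, 0], hi=[255, 255, 255, 1023])`; the 1023 here is only an example value.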

jorgyz commented 8 years ago

Thanks for your response, @alexgkendall. All of the pixel values range from 0-255. I have not applied any scaling to the values. Do you think that would help?

Pepslee commented 8 years ago

Can you explain how I can process 4-channel images? In the function ReadImageToCVMat there are only two flags: CV_LOAD_IMAGE_COLOR and CV_LOAD_IMAGE_GRAYSCALE. What do I need to change? Only CV_LOAD_IMAGE_COLOR to CV_LOAD_IMAGE_UNCHANGED, or something else?

jorgyz commented 8 years ago

@pepslee I'm using Python. I used PIL to open my image, which is a 4-channel TIFF. For example:

    import numpy
    from PIL import Image

    im = Image.open(filename)
    # For a 4-channel image this gives a height x width x 4 array.
    imarr = numpy.array(im, dtype=numpy.uint8)

From there you can save your images into an lmdb file. There are many examples floating around that show how to do that.
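The LMDB step could look roughly like the sketch below. It assumes the `lmdb` Python package and pycaffe are installed, and that each image is a height x width x channels `uint8` array such as the one loaded above. The function names, the zero-padded key format, and the `map_size` estimate are illustrative choices, and labels are not handled here.

```python
import numpy as np


def to_caffe_layout(image_hwc):
    """Reorder height x width x channels to channels x height x width."""
    if image_hwc.ndim != 3:
        raise ValueError("expected a 3-D array (height, width, channels)")
    return np.ascontiguousarray(image_hwc.transpose(2, 0, 1))


def write_images_to_lmdb(images, lmdb_path):
    """Write a list of H x W x C uint8 arrays to an LMDB (hypothetical helper)."""
    # Imported here so the layout helper above works without these packages.
    import lmdb
    import caffe

    # Rough upper bound on the database size; an assumption, adjust as needed.
    map_size = max(sum(img.nbytes for img in images) * 10, 1 << 20)
    env = lmdb.open(lmdb_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for i, img in enumerate(images):
            datum = caffe.io.array_to_datum(to_caffe_layout(img))
            key = "{:08d}".format(i).encode("ascii")
            txn.put(key, datum.SerializeToString())
    env.close()
```

The transpose matters because Caffe expects channel-first data; writing the array in PIL's height x width x channels order would scramble the channels.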

Pepslee commented 8 years ago

I'm using C++, and I open the images with OpenCV, which can also read 4-channel TIFFs. But I get a strange result: if I train on the four-channel data for a small number of epochs (< 10), the result looks normal (although the predictions are still fairly noisy); however, with a model trained for 20 or more epochs, vertical black lines begin to appear across the entire result mask. How did you solve your problem with class 3?

[Image: to_git]

Black is class 1, green is class 2.

But there are strange vertical lines.

The problem begins to appear as the number of epochs increases.

jorgyz commented 8 years ago

Try using a smaller learning rate. That seemed to help for me.