orobix / retina-unet

Retina blood vessel segmentation with a convolutional neural network

Regarding using this solution to solve another type of semantic segmentation problem #13

Closed wenouyang closed 7 years ago

wenouyang commented 7 years ago

Hi, Thanks for sharing the code. I have a general question regarding the type of semantic segmentation task you are trying to solve.

In addition to the retina blood vessel segmentation, another popular segmentation task is discussed in https://www.kaggle.com/c/ultrasound-nerve-segmentation The ground truth labelled area is usually like this one.

[image: example ground-truth mask]

It seems to me that, for the retina study, the labelled area in the ground truth image is more like a tree structure with many small branches, while for the ultrasound study the labelled area is always a single block. Even though both belong to the semantic segmentation problem, the characteristics of the labelled areas differ. My question is: will the network architecture you designed here be a good fit for the Kaggle study? Do you have any specific suggestions on what modifications to make when adapting your solution to that Kaggle problem?

lantiga commented 7 years ago

Yes, the technique is very flexible. We have indeed applied it to the ultrasound nerve segmentation challenge. We actually ranked quite well but we missed the "tricks" phase to make it to the top. In that case we didn't break up the image into pieces but processed the whole thing. In the retina case we break it up into small blocks, so the segmentation in the individual block is actually quite compact. We're reasoning about specific architectures for branched structures, hopefully we'll be able to work on them in the future.
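The patch-based idea described above can be sketched roughly as follows. This is a hypothetical illustration with made-up sizes, not the repo's actual extraction code:

```python
import numpy as np

def extract_patches(image, patch_size, stride):
    """Slide a window over a 2-D image and collect square patches.

    Each patch is segmented independently, so the structure inside
    any single block stays compact even for branched vessels.
    """
    h, w = image.shape
    patches = []
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patches.append(image[y:y + patch_size, x:x + patch_size])
    return np.stack(patches)

# e.g. a 48x48 crop split into non-overlapping 16x16 blocks
img = np.arange(48 * 48, dtype=np.float32).reshape(48, 48)
patches = extract_patches(img, patch_size=16, stride=16)
print(patches.shape)  # (9, 16, 16)
```

A smaller stride than `patch_size` would give overlapping patches, whose predictions can later be averaged when stitching the full segmentation back together.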

Our code for the ultrasound challenge was quite similar to https://github.com/jocicmarko/ultrasound-nerve-segmentation. To me it was surprising to see how well it performed, given that I could barely discern the nerve myself on several images.

argman commented 7 years ago

@lantiga, thanks for sharing the code. I want to ask how you chose the preprocessing for this dataset. Why do you use grayscale instead of RGB?

lantiga commented 7 years ago

We use grayscale because retinal images (except the more cutting edge, very recent cameras) are originally grayscale and get colored after the fact. The information in there is inherently 1-channel.
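For the conversion itself, a common approach (an assumption here, not necessarily what the repo does) is a weighted sum of the RGB channels using the standard luminosity coefficients:

```python
import numpy as np

def rgb_to_gray(rgb):
    """Collapse an (H, W, 3) RGB image to one channel using the
    Rec. 601 luminosity weights (0.299 R + 0.587 G + 0.114 B)."""
    return rgb @ np.array([0.299, 0.587, 0.114])

# pure-green test image: gray value should equal the green weight
rgb = np.zeros((2, 2, 3))
rgb[..., 1] = 1.0
gray = rgb_to_gray(rgb)
print(gray.shape)  # (2, 2)
```

Since retinal images carry essentially one channel of information, this step loses little and reduces the network's input size by a factor of three.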

As to the rest of pre-processing, the basic idea is to remove slower trends across patches.

argman commented 7 years ago

@lantiga , what do you mean by trends across patches ?

lantiga commented 7 years ago

Mostly removing low-frequency changes in contrast and normalizing the intensity locally, so that each patch has similar intensity statistics with respect to the others and local changes are enhanced. See https://en.wikipedia.org/wiki/Adaptive_histogram_equalization for instance.
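A simpler stand-in for CLAHE that conveys the same idea is per-patch standardization: each patch is shifted and scaled so that all patches share the same intensity statistics, which suppresses slow illumination trends across the image. This is an illustrative sketch, not the repo's actual preprocessing:

```python
import numpy as np

def normalize_patches(patches, eps=1e-8):
    """Standardize each patch in an (N, H, W) stack to zero mean and
    unit variance, removing slow-varying brightness differences
    between patches while preserving local contrast."""
    means = patches.mean(axis=(1, 2), keepdims=True)
    stds = patches.std(axis=(1, 2), keepdims=True)
    return (patches - means) / (stds + eps)

rng = np.random.default_rng(0)
patches = rng.uniform(0, 255, size=(9, 16, 16))
norm = normalize_patches(patches)
print(norm.mean(axis=(1, 2)))  # all approximately 0
```

CLAHE goes further by equalizing the local histogram (with a clip limit to avoid amplifying noise), but both techniques share the goal of making intensity statistics uniform across the image.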