MarvinTeichmann / tensorflow-fcn

An Implementation of Fully Convolutional Networks in Tensorflow.
MIT License

regarding using this model for my own data set with only two classes #23

Closed: wenouyang closed this issue 7 years ago

wenouyang commented 7 years ago

Hi Marvin,

Thank you for sharing the code. May I ask you for some advice?

I am trying to adapt your code to one of my data sets, which has about 300 images of size 600*800. The masks have two classes, and class 1 takes up about 20% of each image. The images themselves are quite far from the data set used to pre-train the VGG model, for instance bio-medical data.

I have several questions: 1) Can I still use the pre-trained VGG weights? 2) Which parts of the code do I need to change to handle the two-class case? 3) Since the images are fairly large and their number is limited, I plan to do patch-wise training, i.e., sampling patches from the original images. Should I heavily sample the areas corresponding to class 1? During training, do all the patches from a single image have to go into the same batch, or can I just use batch size 1?

Wenouyang

MarvinTeichmann commented 7 years ago

Hi Wenouyang,

Firstly, have you seen my KittiSeg repository? The KittiSeg code shows you how to train FCN on two-class images, which might answer Questions 2 and 3. In particular, 600*800 should be small enough to train on whole images. The main limitation will be GPU memory, but with about 12 GB it should fit. If memory is a problem, you can crop patches. The code is fully convolutional, so even if you train on smaller patches you will not need to sample patches during inference. Also, 300 images is more than enough if you use pretrained weights: in KittiSeg I use 200 images, and in MediSeg fewer than 80.

Regarding your first question: most likely it is still advisable to use pretrained weights. Most of the early layers of VGG have specialized to detect low-level geometric structures like edges, blobs, etc., and those features are useful in almost any task. Retraining the later layers, which learn high-level features, is the easiest part.
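For Question 2, the essential change is that the final score layers and the loss operate over two classes instead of the original number. Purely as an illustration (this is not code from the repository), a per-pixel two-class softmax cross-entropy with optional class weighting can be sketched in numpy as:

```python
import numpy as np

def pixelwise_softmax_xent(logits, labels, class_weights=(1.0, 1.0)):
    """Per-pixel softmax cross-entropy for a two-class segmentation head.

    logits: (H, W, 2) raw scores, labels: (H, W) integers in {0, 1}.
    `class_weights` lets you up-weight the rarer class (here class 1
    covers only ~20% of the pixels).
    """
    # Log-softmax with the max subtracted for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the true class at every pixel.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    weights = np.asarray(class_weights, dtype=float)[labels]
    return -(weights * picked).sum() / weights.sum()
```

Up-weighting class 1 in the loss is an alternative (or complement) to oversampling class-1 patches; both counteract the 80/20 class imbalance.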

Also, I did train FCN on medical data in the MediSeg project. That data is also quite different from the ImageNet data used for pretraining. Nevertheless, using pretrained weights was very useful, and we got very good results with fewer than 80 images, so you should be fine. Overall my code is optimized to perform well on limited data.

Marvin

wenouyang commented 7 years ago

Hi Marvin,

Thank you very much for your reply. I will close this thread.

If I have further questions after reading the MediSeg and KittiSeg projects, I will open a new thread to avoid confusion here. Thanks a lot for sharing all of this.

wenouyang