mzaradzki / neuralnets

Deep Learning libraries tested on images and time series
MIT License

Some questions about the FCN code #2

Closed CreatCodeBuild closed 7 years ago

CreatCodeBuild commented 7 years ago

Thank you for writing this.

I have several questions about https://github.com/mzaradzki/neuralnets/tree/master/vgg_segmentation_keras/utils.py

First, is there a specific reason you only define fcn32s, plus 2 functions to convert 32s to 16s and 8s? Is there any specific difficulty in defining 16s or 8s directly?

Second, why do you name the function fcn32_blank? Does "blank" mean that this model cannot be used directly?

Thank you very much

mzaradzki commented 7 years ago

hi @CreatCodeBuild

The main reason I split the function into separate steps is that I wanted to compare the relative performance of the different model configurations (as in the jupyter notebook, at the end).

Also, the model is not so easy to build from the original paper: the architecture is not crystal clear when you are not an expert in this. So to get to my goal I actually iterated from the simplest model to the finest one.

The name "_blank" indeed indicates that it is a "utility/intermediate" function used to build the more accurate models. The thing is that it is not simply a sequence FCN32 => FCN16 => FCN8. That is why you have the 2 functions "32_blank to 16" and "32_blank to 8" in my code, rather than a sequential composition. The main reason is that the dimension/resolution changes: once you are at the end of 32 (not 32_blank) you cannot go directly to 8 or 16. If it helps, you can compare the diagrams for 16 and 8 in the blog: https://medium.com/@m.zaradzki/image-segmentation-with-neural-net-d5094d571b1e This will help you see the part that is common to the 2 models and that I "factorized" into 32_blank.
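To make the resolution argument concrete, here is a rough sketch that only tracks feature-map sizes. It is not the code from utils.py: the 512-pixel input, the function names, and the assumption that 32_blank ends at the coarse score map are all illustrative, and the strides (8, 16, 32) are the standard VGG16 ones used in the FCN paper.

```python
# Illustrative only: a 512-pixel input side and the standard VGG16 strides.
INPUT = 512
pool3 = INPUT // 8    # 64: feature map after the third pooling layer
pool4 = INPUT // 16   # 32: feature map after the fourth pooling layer
score = INPUT // 32   # 16: coarse score map (assumed end of "32_blank")


def fcn32(score):
    # One single x32 upsampling straight back to the input size.
    return score * 32


def fcn16(score, pool4):
    # x2 upsampling, fuse with pool4 (sizes must match), then x16.
    x = score * 2
    assert x == pool4
    return x * 16


def fcn8(score, pool4, pool3):
    # x2 and fuse with pool4, x2 again and fuse with pool3, then x8.
    x = score * 2
    assert x == pool4
    x = x * 2
    assert x == pool3
    return x * 8


out32 = fcn32(score)
out16 = fcn16(score, pool4)
out8 = fcn8(score, pool4, pool3)
```

All three variants start from the same coarse map and end at the input size. The output of fcn32 is already at full resolution, so it can no longer be fused with pool4 or pool3, which is why 16 and 8 both branch off the shared part instead of being chained after 32.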

CreatCodeBuild commented 7 years ago

I spent some time searching for how exactly "deconvolution" is computed. The so-called "deconvolution" is not an actual mathematical deconvolution. It is actually a transposed convolution, which is just a normal convolution computation with some tricks. Now I finally understand how strides (subsampling) and kernel size affect the size of the final result.
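For reference, a small sketch of the size arithmetic, assuming no padding and no cropping (the function names are made up for this example, they are not Keras API):

```python
def conv_output_size(n, kernel, stride):
    """Output length of a convolution with no padding ('valid')."""
    return (n - kernel) // stride + 1


def transposed_conv_output_size(n, kernel, stride):
    """Output length of a transposed convolution with no padding or cropping."""
    return (n - 1) * stride + kernel


# The transposed convolution maps sizes in the opposite direction
# of the convolution with the same kernel size and stride.
small = conv_output_size(34, kernel=4, stride=2)
large = transposed_conv_output_size(small, kernel=4, stride=2)

# With a kernel of 64 and a stride of 32 (the values I assume for the
# final FCN-32s upsampling), a 16-wide score map becomes 544 wide,
# so it has to be cropped to get back to a 512-wide input.
upscored = transposed_conv_output_size(16, kernel=64, stride=32)
```

Here `small` is 16 and `large` is 34 again, so the transposed convolution undoes the size change of the convolution, not its values.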

I found this paper useful: http://www.matthewzeiler.com/pubs/cvpr2010/cvpr2010.pdf Indeed, as you said, the FCN for semantic segmentation paper is not that clear for beginners.

More surprisingly, the "convolutional layer" actually does correlation instead of convolution if we use the computer vision terms.
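A tiny 1D example of the difference, in plain Python (helper names are just for illustration). Correlation slides the kernel as it is, while convolution flips the kernel first:

```python
def correlate1d(signal, kernel):
    """Slide the kernel over the signal as-is (no padding)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def convolve1d(signal, kernel):
    """Mathematical convolution: same sliding sum, but with the kernel flipped."""
    return correlate1d(signal, kernel[::-1])


signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]

corr = correlate1d(signal, kernel)   # [-2, -2, -2]
conv = convolve1d(signal, kernel)    # [2, 2, 2]
```

For a learned kernel the flip does not matter much, since the network can simply learn the flipped weights, but it does matter when loading weights from a toolkit that uses the other convention.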

After carefully studying your code and your medium post, I understand why you wrote your code this way. It makes a lot of sense.

Since I am using Keras 2, I will need to change the code a little bit. (Keras just loves to break API)

I wonder what you mean by "I actually iterated from the simplest one to the finest one". If you are to load weights from the matlab toolkit, don't you have to define the network exactly the same as matlab toolkit does?

mzaradzki commented 7 years ago

Keras 2: indeed, I wrote most of my Keras scripts using Keras 1, and migrating my code has proven more time consuming than I would have liked, so the FCN is still Keras 1. If you have time, once you are done with Keras 2, to explain to me what needs to be modified, that would be great. If you don't have time I fully understand of course.

About your question:

"If you are to load weights from the matlab toolkit, don't you have to define the network exactly the same as matlab toolkit does?"

The Matlab weight files are available for both the 16s and 8s versions, so I first coded 16s using these weights and then implemented 8s with the other weight file. Actually, I found that the .mat files from vlfeat (http://www.vlfeat.org/matconvnet/pretrained/) also help clarify the model structure, as you can read the shapes of all the weights to figure out the shapes of your layers!

One last thing that may help you a lot with Keras 2: Andrew Hundt posted this link in another chat yesterday: https://github.com/farizrahman4u/keras-contrib/blob/master/keras_contrib/applications/densenet.py#L175