erfannoury / SuperEdge

Supervised Edge Detection
MIT License

Layer Selection #2

Open erfannoury opened 9 years ago

erfannoury commented 9 years ago

There are three main kinds of layers in VGG Net architecture:

  1. Convolutional layers
  2. Max-pooling layers
  3. Fully-connected layers

Convolutional layers encode fine and coarse localized information at multiple semantic levels, so they should be included in the feature vector (the sketch at the end of this comment shows how their activations stack into a hypercolumn). Max-pooling layers are non-linear, down-sampled versions of the convolutional layers that precede them, so including them in the feature vector would only add redundancy; the translation invariance they introduce is carried forward into the convolutional layer that follows them anyway. Fully-connected layers carry no localization information, so including them won't improve localization precision or localized semantic information. However, they do contain high-level semantic information that could be beneficial in cases where semantic cues are required.

Despite all this, the paper "Hypercolumns for Object Segmentation and Fine-grained Localization" uses all three kinds of layers to build the feature vector. More recent papers, however, use only the activations of convolutional layers.

So selecting which layers to include in the final feature vector remains an open question.
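For concreteness, here is a minimal numpy sketch of the stacking step (toy shapes, nearest-neighbour upsampling, and a made-up `build_hypercolumns` name; not our actual pipeline):

```python
import numpy as np

def build_hypercolumns(feature_maps, out_h, out_w):
    """Resize each conv feature map to image resolution and stack them
    into one hypercolumn vector per pixel. Nearest-neighbour upsampling
    is used for simplicity; bilinear would be the usual choice."""
    columns = []
    for fmap in feature_maps:                    # each fmap: (channels, h, w)
        c, h, w = fmap.shape
        rows = (np.arange(out_h) * h) // out_h   # nearest-neighbour indices
        cols = (np.arange(out_w) * w) // out_w
        columns.append(fmap[:, rows][:, :, cols])   # (c, out_h, out_w)
    stacked = np.concatenate(columns, axis=0)    # (total_channels, out_h, out_w)
    return stacked.reshape(stacked.shape[0], -1).T  # (out_h*out_w, total_channels)

# Toy example: three "conv" maps at decreasing resolutions.
maps = [np.random.rand(64, 240, 320),
        np.random.rand(128, 120, 160),
        np.random.rand(256, 60, 80)]
hypercolumns = build_hypercolumns(maps, 480, 640)
print(hypercolumns.shape)   # (307200, 448): one 448-dim vector per pixel
```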

yassersouri commented 9 years ago

I think pooling layers are not needed.

I also think that convolutional layers are absolutely needed.

For fc layers we can experiment, but I think they are not needed either, unless you convert the fc layers to their convolutional equivalents, in which case there might be some use cases for them.
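On the fc-to-conv conversion, here is a quick numpy check of the equivalence with toy sizes (all names and shapes are illustrative):

```python
import numpy as np

# An fc layer applied to a flattened (c, h, w) conv feature map is
# equivalent to a convolution whose kernel covers the whole map.
c, h, w, n_out = 16, 7, 7, 32
x = np.random.rand(c, h, w)
W = np.random.rand(n_out, c * h * w)        # fc weight matrix

fc_out = W @ x.reshape(-1)                  # ordinary fc forward pass

# Reshape each fc row into a (c, h, w) kernel applied at one position
# (a "valid" convolution whose kernel size equals the input size).
kernels = W.reshape(n_out, c, h, w)
conv_out = np.einsum('ochw,chw->o', kernels, x)

print(np.allclose(fc_out, conv_out))        # True
```

On inputs larger than the training resolution, the conv form slides across the map and yields a spatial grid of fc-style outputs, which is what would make it usable in a pixel-wise feature vector.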

erfannoury commented 9 years ago

I agree with you on the max-pooling and convolutional layers, but the fc layers are still up for debate. Yes, an fc layer can be interpreted as a convolutional layer with height and width of one and depth equal to the size of the fc layer. But consider the portion of the feature vector that comes from the fc layers once the feature maps are resized and stacked: that portion is identical for every pixel of a given image, so it is not discriminative within the image. Ours is a pixel classification task, and the pixels being discriminated all belong to the same image. When comparing feature vectors of test-image pixels against training samples, that portion would differ between images, but it would do nothing to separate the pixels of one image from each other. So I think including fc layers in the pixel-wise feature vector won't be beneficial.
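A toy illustration of that point (names and shapes made up): the fc portion has zero variance across the pixels of one image, so no classifier can use it to tell those pixels apart.

```python
import numpy as np

n_pixels, conv_dim, fc_dim = 1000, 448, 64
conv_part = np.random.rand(n_pixels, conv_dim)             # varies per pixel
fc_part = np.tile(np.random.rand(fc_dim), (n_pixels, 1))   # identical for every pixel

features = np.hstack([conv_part, fc_part])                 # per-pixel feature vectors

print(features[:, :conv_dim].var(axis=0).mean())  # > 0: conv part is informative
print(features[:, conv_dim:].var(axis=0).max())   # 0.0: fc part can't separate pixels
```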

yassersouri commented 9 years ago

I still think experimenting is the only way to go.

erfannoury commented 9 years ago

My current implementation doesn't handle fc layers well, but I'll try to fix that. Then we can experiment with them, though I still think using fc layers won't be beneficial.

yassersouri commented 9 years ago

Now that you have an implementation, let's try to get some results. We will work on fc layers later.

erfannoury commented 9 years ago

I will try to finish the first phase of the implementation by the end of this week. I'll first use hypercolumns as the pixel-wise feature vectors; in the next phase, I'll experiment with hyper-pyramids. That gives 320×480×200 training samples (every pixel of 200 images), each 4224-dimensional. At first these will be fed into a random forest, though I'm also thinking of experimenting with dense neural networks with two or three hidden layers. There are some issues with training that I will bring up in another issue thread.
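A rough scikit-learn sketch of what that first training pass could look like; the shapes are shrunk so it actually runs, and the random `dataset` stands in for the real hypercolumn features and edge labels:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy stand-in for the real data: in the plan above each image yields
# 320*480 pixels with 4224-dim hypercolumns; everything is shrunk here.
n_images, pixels_per_image, dim = 5, 320 * 480 // 100, 64
dataset = [(rng.random((pixels_per_image, dim), dtype=np.float32),
            rng.integers(0, 2, pixels_per_image))
           for _ in range(n_images)]

# Subsample pixels per image -- training on every pixel of every image
# won't fit in memory.
n_pixels_sampled = 500
X_parts, y_parts = [], []
for img_features, img_labels in dataset:
    idx = rng.choice(img_features.shape[0], n_pixels_sampled, replace=False)
    X_parts.append(img_features[idx])
    y_parts.append(img_labels[idx])

X = np.vstack(X_parts)
y = np.concatenate(y_parts)

clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
clf.fit(X, y)
print(clf.score(X, y))
```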

yassersouri commented 9 years ago

The biggest issue with training will be RAM!
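A back-of-the-envelope estimate of why, assuming float32 features:

```python
pixels_per_image = 320 * 480    # 153,600
n_images = 200
dim = 4224
bytes_per_float = 4             # float32

total_bytes = pixels_per_image * n_images * dim * bytes_per_float
print(total_bytes / 2**30)      # ~483 GiB for the full training matrix
```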


erfannoury commented 9 years ago

Yes! This is a huge issue!

erfannoury commented 9 years ago

Now that I think about it more, fc layers are needed for semantic tasks.

yassersouri commented 9 years ago

Believe me, in applied machine learning research, thinking is not the way to go!
