titu1994 / keras-coordconv

Keras implementation of CoordConv for all Convolution layers
MIT License

About architectures in paper #10

Closed mrgloom closed 5 years ago

mrgloom commented 5 years ago

Is there any intuition about how to use the CoordConv layer? The paper does not describe its architectures in detail: should we replace all convolutions in an existing model (for example VGG16) with CoordConv layers, or just the first convolution?

titu1994 commented 5 years ago

CoordinateChannel adds one or more input features for the CNN to learn from. It is equivalent to giving an RGBXY "image" as input to the CNN.
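To make the RGBXY idea concrete, here is a minimal NumPy sketch that appends two coordinate channels to a channels-last batch. The function name `add_xy_channels` and the scaling of coordinates to [-1, 1] are assumptions for illustration, not necessarily what this repository's layer does.

```python
import numpy as np

def add_xy_channels(batch):
    """Append x and y coordinate channels to an (N, H, W, C) batch (hypothetical helper)."""
    batch = batch.astype(np.float32)
    n, h, w, _ = batch.shape
    # Coordinates scaled to [-1, 1]; this scaling is an assumed convention.
    xs = np.linspace(-1.0, 1.0, w, dtype=np.float32)
    ys = np.linspace(-1.0, 1.0, h, dtype=np.float32)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")      # each has shape (H, W)
    coords = np.stack([xx, yy], axis=-1)             # (H, W, 2): x first, then y
    coords = np.broadcast_to(coords, (n, h, w, 2))   # same grid for every sample
    return np.concatenate([batch, coords], axis=-1)  # (N, H, W, C + 2)

rgb = np.zeros((4, 32, 32, 3), dtype=np.float32)
rgbxy = add_xy_channels(rgb)
print(rgbxy.shape)  # (4, 32, 32, 5)
```

Channel 3 varies only along the width and channel 4 only along the height, so every pixel carries its own position alongside its RGB values.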

Given that, it makes little sense to add XY channels to subsequent layers, as the earlier layers have already learned from that input and extracted any useful information from it. Therefore there is no reason to add more CoordinateChannel layers after the first one.

Think of it this way: you are not resizing and concatenating your original RGB image into subsequent layers, so why do so with the coordinate channel information?

All that being said, I see no harm (other than a slight increase in parameters and training time) in adding more of these layers to later parts of the network. My guess is that it may not help at all, but I cannot say that with certainty.

mrgloom commented 5 years ago

If it should only be used as the first layer, might it be more efficient to skip the custom layer and feed RGBXY directly to an ordinary Conv2D? That is, the coordinate channels would be added on the batch generator side.
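A rough sketch of what the generator-side approach could look like, assuming channels-last batches of `(images, labels)`. The names `with_xy_channels` and `dummy_batches` are hypothetical, and the [-1, 1] coordinate scaling is an assumption:

```python
import numpy as np

def with_xy_channels(batches):
    """Wrap an iterable of (images, labels) batches, appending x/y channels."""
    for images, labels in batches:
        n, h, w, _ = images.shape
        # Coordinate grids scaled to [-1, 1] (assumed convention).
        yy, xx = np.meshgrid(np.linspace(-1.0, 1.0, h),
                             np.linspace(-1.0, 1.0, w), indexing="ij")
        coords = np.stack([xx, yy], axis=-1).astype(np.float32)  # (H, W, 2)
        coords = np.broadcast_to(coords, (n, h, w, 2))
        yield np.concatenate([images.astype(np.float32), coords], axis=-1), labels

def dummy_batches(num_batches=2, batch_size=4):
    """Stand-in for a real data generator (illustration only)."""
    for _ in range(num_batches):
        yield np.zeros((batch_size, 16, 16, 3), np.float32), np.zeros(batch_size)

shapes = [x.shape for x, _ in with_xy_channels(dummy_batches())]
print(shapes)  # [(4, 16, 16, 5), (4, 16, 16, 5)]
```

With this approach the model's input would need to be declared with 5 channels instead of 3, and the same coordinate channels would have to be added again at inference time.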

mrgloom commented 5 years ago

Interestingly, the idea of adding x, y input planes appears to predate the CoordConv paper. In "Automatic Portrait Segmentation for Image Stylization" the authors use a similar trick: http://xiaoyongshen.me/webpage_portrait/index.html