mkocabas / CoordConv-pytorch

PyTorch implementation of CoordConv, introduced in the paper 'An intriguing failing of convolutional neural networks and the CoordConv solution'. (https://arxiv.org/pdf/1807.03247.pdf)

Why range is [-1,1]? #11

Open mrgloom opened 6 years ago

mrgloom commented 6 years ago

As I can see in the numpy example, it appends maps with values in the range [-1, 1], but as I understand from the paper, they suggest the integer pixel coordinates i, j.

akanimax commented 6 years ago

Hi @mrgloom, please refer to Section 3, paragraph 3, where it is mentioned that the coordinates are normalized to the [-1, +1] range before the conv operation is performed. I believe doing so lets the network work with very large images, on the order of 1024 x 1024, without the risk of exploding activations (and, in turn, exploding gradients). I hope this helps. Please let me know if you have any further queries.

Cheers! @akanimax
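To make the normalization concrete, here is a minimal numpy sketch of building the two coordinate channels and scaling them to [-1, 1]. The function name `coord_channels` and the choice to divide by `n - 1` (so the last pixel lands exactly on +1) are assumptions for illustration, not necessarily what this repo or the paper's code does.

```python
import numpy as np

def coord_channels(height, width):
    """Return (xx, yy) coordinate maps of shape (height, width), scaled to [-1, 1]."""
    # Integer pixel indices: j runs along the width, i along the height.
    j = np.arange(width, dtype=np.float32)
    i = np.arange(height, dtype=np.float32)
    # Map 0..n-1 onto [-1, 1]; max(n - 1, 1) avoids division by zero when n == 1.
    x = 2.0 * j / max(width - 1, 1) - 1.0
    y = 2.0 * i / max(height - 1, 1) - 1.0
    # Broadcast the 1-D ramps to full 2-D maps.
    xx = np.broadcast_to(x[None, :], (height, width))
    yy = np.broadcast_to(y[:, None], (height, width))
    return xx, yy

xx, yy = coord_channels(1024, 1024)
print(xx.min(), xx.max(), yy.min(), yy.max())
```

Without the scaling step, the raw index maps for a 1024 x 1024 image would run from 0 to 1023, which is the magnitude issue described above.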

DpkApt commented 4 years ago

Hi @akanimax, for xx_channel you create a matrix whose columns range from 0 to y_dim, and that is repeated in every row. Then you divide by x_dim to convert to the 0-1 range, before multiplying by 2 and subtracting 1 to shift to [-1, 1]. If it's not the same y_dim (instead of the x_dim mentioned above) that you're dividing by, the result wouldn't be in 0-1. The paper discusses only square images, so this wouldn't be a problem there. Am I missing something? TIA for any help.

akanimax commented 4 years ago

@DpkApt, I don't completely understand your question. Here they do divide by the appropriate dimensions, so the code works for arbitrary image dimensions. Please let me know if you still have any further questions. The specific case I used was just an example; in practice, all you care about is having the coordinates in the range [-1, 1].

Cheers :beers:! @akanimax

DpkApt commented 4 years ago

Ah, you're right. I was looking at the code provided in the paper and in the official repo. The difference from the code you shared is that xx_channel is formed with range(y_dim) in the official code and with range(x_dim) in yours. That was actually my point.
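The concern about mismatched dimensions can be checked with a small numpy sketch. The helper `x_ramp` and the sizes 5 and 9 are hypothetical, and dividing by `dim - 1` is an assumption made here so that the matched case ends exactly at +1; this is not a copy of either repo's code.

```python
import numpy as np

def x_ramp(range_len, divisor_dim):
    """Ramp 0..range_len-1, divided by (divisor_dim - 1), then mapped via 2*v - 1."""
    values = np.arange(range_len, dtype=np.float64)
    return 2.0 * values / (divisor_dim - 1) - 1.0

x_dim, y_dim = 5, 9  # hypothetical non-square sizes

matched = x_ramp(y_dim, y_dim)     # range and divisor use the same dimension
mismatched = x_ramp(y_dim, x_dim)  # range over y_dim, but divided by x_dim - 1

print(matched.min(), matched.max())
print(mismatched.min(), mismatched.max())
```

With these sizes the matched ramp spans [-1, 1], while the mismatched one spans [-1, 3]. For square inputs (x_dim == y_dim) the two are identical, which is why the issue does not show up there.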