make_score_map sometimes returns a score map 1 pixel larger than the image

jackkwok commented 5 years ago

In make_score_map(), I found cases where the score map shape differs from the input image shape by 1 pixel, e.g. image shape (2000, 1359) vs. score map shape (2000, 1360).

Can you please explain the calculation behind the offset value:

Quote from producer.py:

The offset is inserted so that the final size of the score map matches the search image. To know more see "How to overlay the search img with the score map" in Trello/Report. It is half of the dimension of the Smallest Class Equivalent of the Ref image.

offset = (((self.ref.shape[0] + 1)//4)*4 - 1)//2
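To make the arithmetic of the quoted expression concrete, here is a minimal sketch that applies it to a plain integer instead of `self.ref.shape[0]`. The function name `ref_offset` and the sample heights are hypothetical and not taken from producer.py. Reading the expression literally, it appears to round the reference height down to the nearest value of the form 4k - 1 and then take half of that with integer division.

```python
def ref_offset(ref_height: int) -> int:
    """Apply the offset expression quoted from producer.py to one dimension.

    `ref_height` stands in for self.ref.shape[0]; the name is illustrative.
    """
    # Largest value of the form 4k - 1 that does not exceed ref_height.
    equivalent = ((ref_height + 1) // 4) * 4 - 1
    # Half of that value, rounded down.
    return equivalent // 2


if __name__ == "__main__":
    # Hypothetical reference heights, chosen only to show the rounding.
    for h in (126, 127, 128, 129, 130, 131, 255):
        print(h, ref_offset(h))
```

Under this reading, heights 127 through 130 all share the same offset, because they round down to the same 4k - 1 value.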

jackkwok commented 5 years ago

I think I found the reason. The 2D pooling layer (size=3, stride=2) uses no padding, so the effective downscaling factor is not always exactly 2. That may explain the 1 pixel deviation.
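The effect can be illustrated with the standard output-size formula for an unpadded pooling window, floor((n - size) / stride) + 1. This is a sketch of a single pooling layer in isolation, not of the full network or of make_score_map(); the helper name `pool_out` is hypothetical.

```python
def pool_out(n: int, size: int = 3, stride: int = 2) -> int:
    """Output length of an unpadded pooling window along one dimension."""
    return (n - size) // stride + 1


if __name__ == "__main__":
    # The two widths from this issue: image width and score map width.
    for n in (1359, 1360):
        print(n, pool_out(n))
```

An odd width and the next even width produce the same pooled length, so the original size cannot be recovered exactly from the pooled size alone. Scaling back up can therefore land one pixel away from the input.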

rafellerc commented 5 years ago

Yes, exactly. It is worth stressing that the lack of padding is a necessary condition for the network to be "fully-convolutional", as explained in the paper.

rafellerc / Pytorch-SiamFC

make_score_map sometimes returns a score map 1 pixel larger than the image #18
