Question about the relation between 'disp_response_interp' and 'disp_response_input'

ZSQflower commented 6 years ago

Hi, Zhang (maybe :) ), I have a question that has troubled me for two days. In the tracking phase, the response heatmap is upsampled by a ratio (response_up_stride = 16), which increases the resolution of the heatmap to 272. This resolution of 272 is bigger than the resolution of the input instance patch (255). The equation 'disp_response_input = disp_response_interp * config.total_stride / config.response_up_stride' is meant to map the response location to a location in the input instance (network input size 255), but it is quite hard to understand why the factor is 'config.total_stride / config.response_up_stride'. Maybe
'disp_response_input = disp_response_interp * config.instance_size / config.disp_response_interp' is more understandable as a linear mapping?

Thanks @StrangerZhang

StrangerZhang commented 6 years ago

Calculate the mapping according to the stride, not the input size.
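To make the stride-based mapping concrete, here is a minimal sketch of the conversion discussed above. It is not the repository's code: the function name `response_to_input_displacement` and its arguments are hypothetical, and the default values (total_stride = 8, response_up_stride = 16, upsampled size 272) are taken from the numbers quoted in this thread. The idea is that one cell of the raw response corresponds to total_stride pixels of the instance patch, and upsampling by response_up_stride splits each cell into that many sub-cells, so one upsampled cell corresponds to total_stride / response_up_stride pixels.

```python
def response_to_input_displacement(peak, interp_size=272,
                                   total_stride=8, response_up_stride=16):
    """Map a peak in the upsampled response to a displacement in instance pixels.

    Hypothetical helper for illustration; `peak` is a (col, row) index into
    the upsampled response map of side `interp_size`.
    """
    # Displacement of the peak from the centre of the upsampled response.
    center = (interp_size - 1) / 2.0
    disp_interp = (peak[0] - center, peak[1] - center)

    # One upsampled cell covers total_stride / response_up_stride input pixels,
    # so the scale depends on the strides, not on the instance size.
    scale = total_stride / response_up_stride
    return (disp_interp[0] * scale, disp_interp[1] * scale)


# A peak 15.5 cells right of and 0.5 cells above the centre (135.5, 135.5).
dx, dy = response_to_input_displacement((151, 135))
print(dx, dy)
```

With the values from this thread the scale is 8 / 16 = 0.5, so each step in the 272-wide map moves the estimate by half a pixel in the 255-pixel instance patch.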

ZSQflower commented 6 years ago

OK, I understand that, thanks. So, according to SiamFC, the response map reflects the translation. The max translation is designed to be (17 - 1) / 2 * total_stride = 64, so the response does not map onto the full instance search patch. Hence, do we suppose that the convolutional layers of stride 1 do not bring in translation? ~
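The arithmetic in the comment above can be checked with a short sketch. The helper names are hypothetical, and the exemplar size of 127 is an assumption taken from the SiamFC paper's setup rather than from this thread; the other numbers (instance size 255, total stride 8, response size 17) are the ones quoted here.

```python
def score_map_size(instance_size=255, exemplar_size=127, total_stride=8):
    """Side of the raw response map for a fully convolutional cross-correlation.

    exemplar_size=127 is an assumption (SiamFC paper setting), not stated
    in this thread.
    """
    return (instance_size - exemplar_size) // total_stride + 1


def max_translation(response_size=17, total_stride=8):
    """Largest displacement from the centre, in instance pixels, that the
    raw response can express: half the map width times the stride."""
    return (response_size - 1) / 2 * total_stride


size = score_map_size()             # side of the raw response map
reach = max_translation(size)       # furthest offset from the centre
half_instance = (255 - 1) / 2       # half-width of the instance patch
print(size, reach, half_instance)
```

Under these assumptions the response reaches only about half of the instance patch's half-width, which matches the observation that the response does not cover the full search patch.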

zj717754140 commented 5 years ago

I still can't understand that. Can you explain it in more detail?

zj717754140 commented 5 years ago

Why is the backbone total stride 8 while response_up_stride is 16?

StrangerZhang / SiamFC-PyTorch

Question about the relation between 'disp_response_interp' and 'disp_response_input' #4