how the tracking ROI pooling layer could work?

feichtenhofer / Detect-Track

Code release for "Detect to Track and Track to Detect", ICCV 2017

http://www.robots.ox.ac.uk/~vgg/research/detect-track/


how the tracking ROI pooling layer could work? #19

Closed LiangXu123 closed 6 years ago

LiangXu123 commented 6 years ago

Thanks for your excellent work, but I am confused about some details in your paper. As far as I can see, the tracking ROI pooling layer in the paper operates on the stack of {Xcorr, Xreg-t, Xreg-t+1}. Both the Xreg-t and Xreg-t+1 layers have a shape of k*k*4. Xcorr consists of the correlation outputs of conv3, conv4, and conv5, and each correlation output should have a shape like H*W*(2d+1)*(2d+1). So:
1. How do you concatenate different layers, such as Xcorr and Xreg-t?
2. How do you pool over the stacked feature map?
Thank you.

Feynman27 commented 6 years ago

After the correlation layers in Conv3, Conv4, and Conv5, the (h,w) dims of the correlation outputs should be the same as the (h,w) of the bbox reg feature map. At that point, you can simply concat the features along the channel dimension, apply a 1x1 convolution to reduce the number of channels to 4 * n_reg_classes * 7 * 7 (e.g. n_reg_classes=1 for class-agnostic bbox regression), and run a pooling operation (e.g. position-sensitive ROI pooling) on that feature map.
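
To make the channel bookkeeping in the reply above concrete, here is a rough shape-only sketch in plain Python. It is not taken from the Detect-Track code. The function names, the flattening of the (2d+1)*(2d+1) displacement grid into channels, the assumption that each bbox reg map has 4 * n_reg_classes * k * k channels, and the example values (h=38, w=63, d=8) are all illustrative assumptions.

```python
def correlation_channels(d):
    """Channels of one correlation map once the (2d+1) x (2d+1)
    displacement grid is flattened into the channel axis (assumption)."""
    return (2 * d + 1) ** 2


def tracking_head_shapes(h, w, d, k=7, n_reg_classes=1, n_corr_layers=3):
    """Return (channels, h, w) shapes for each stage of the tracking head.

    Assumes every input map already shares the same (h, w), as the reply states.
    """
    # Correlation outputs from conv3, conv4 and conv5, stacked on channels.
    corr_c = n_corr_layers * correlation_channels(d)
    # Assumed size of each bbox regression map (frame t and frame t+1).
    reg_c = 4 * n_reg_classes * k * k
    # Concatenation along the channel dimension just adds channel counts.
    stacked_c = corr_c + 2 * reg_c
    # The 1x1 convolution reduces the stack to 4 * n_reg_classes * k * k channels.
    out_c = 4 * n_reg_classes * k * k
    return {
        "stacked": (stacked_c, h, w),
        "after_1x1": (out_c, h, w),
        # Position-sensitive ROI pooling reads one channel group per k x k bin,
        # giving a (4 * n_reg_classes, k, k) tensor for every ROI.
        "per_roi": (4 * n_reg_classes, k, k),
    }


shapes = tracking_head_shapes(h=38, w=63, d=8)
print(shapes)
```

With these example values the stack has 3 * 17 * 17 + 2 * 196 = 1259 channels, which the 1x1 convolution reduces to 196 before pooling. Averaging the k x k bins of the per-ROI output would then give one 4-vector per regression class, as in standard position-sensitive ROI pooling.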

LiangXu123 commented 6 years ago

Got that now, thank you for being so patient.