Open heyitsguay opened 3 years ago
I was working on implementing Dynamic Head these days. I found that the spatial-aware attention is just a deformable convolution v2 (Deformable ConvNets v2: More Deformable, Better Results). I think L different deformable convolutions are applied to the L different feature-map levels, and then the outputs are averaged along the L axis.
Hi, could you share the Dynamic Head code you implemented? The original code is too complicated. Thanks!
In Equation 4 in your paper, you specify an equation for the spatial-aware attention module which averages across the levels of the rescaled spatial pyramid. It appears this equation would map an element of R^(LxSxC) to R^(SxC). Is this intended? How do you apply multiple DyHead modules after this stage, if the L axis is collapsed?
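A minimal NumPy sketch of the shape question above, using plain per-level channel-mixing matrices as hypothetical stand-ins for the L level-specific deformable convolutions (the DCNv2 offset and modulation-mask machinery is omitted, and the sizes `L`, `S`, `C` are made up for illustration):

```python
import numpy as np

# Hypothetical sizes: L pyramid levels, S spatial positions (H*W), C channels.
L, S, C = 4, 64, 256
rng = np.random.default_rng(0)

# Rescaled feature pyramid: one (S, C) feature map per level, stacked along L.
F = rng.standard_normal((L, S, C))

# Stand-in for the L level-specific deformable convolutions: one plain
# channel-mixing matrix per level (no DCNv2 offsets/masks here).
W = rng.standard_normal((L, C, C)) / np.sqrt(C)

# Apply each level's "convolution" to its own feature map ...
per_level = np.einsum("lsc,lcd->lsd", F, W)   # shape (L, S, C)

# ... then average along the L axis, as Equation 4 describes.
out = per_level.mean(axis=0)                  # shape (S, C)

print(out.shape)
```

As the shapes show, averaging along L does map R^(L×S×C) to R^(S×C); whether the released implementation actually collapses the L axis this way, or keeps a per-level output so that further DyHead blocks can be stacked, is exactly the question raised above.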