m-zheng closed 1 year ago
Yes, you are right.
This implementation is adapted from `nn.MultiheadAttention()`.
Since we want to implement this module in a multi-head form, we need to project a feature (`C x H x W`) into `N` sub-features (`N x C/N x H x W`) as `N` heads. We then perform weighted feature aggregation within each head, and finally aggregate the features of all heads together.
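The shape bookkeeping described above can be sketched roughly as follows. This is only an illustration, not the code from the repository: it uses NumPy instead of PyTorch, and the helper names (`conv1x1`, `split_heads`, `merge_heads`), the random projection weights, and the per-head weight maps are all hypothetical placeholders.

```python
import numpy as np


def conv1x1(x, weight):
    """Apply a 1x1 convolution to a (C_in, H, W) feature.

    A 1x1 convolution is a linear map over the channel axis,
    so `weight` has shape (C_out, C_in).
    """
    return np.einsum("oc,chw->ohw", weight, x)


def split_heads(x, num_heads):
    """Reshape (C, H, W) into (N, C/N, H, W), one channel group per head."""
    c, h, w = x.shape
    if c % num_heads != 0:
        raise ValueError("channels must be divisible by num_heads")
    return x.reshape(num_heads, c // num_heads, h, w)


def merge_heads(x):
    """Inverse of split_heads: (N, C/N, H, W) back to (C, H, W)."""
    n, c_head, h, w = x.shape
    return x.reshape(n * c_head, h, w)


rng = np.random.default_rng(0)
C, H, W, N = 8, 3, 3, 4  # illustrative sizes only

feature = rng.standard_normal((C, H, W))
in_proj = rng.standard_normal((C, C))   # hypothetical input projection
out_proj = rng.standard_normal((C, C))  # hypothetical output projection

# Project, then split into N heads of C/N channels each.
heads = split_heads(conv1x1(feature, in_proj), N)

# Placeholder for the per-head weighted aggregation: one weight map per head.
head_weights = rng.random((N, 1, H, W))
weighted = heads * head_weights

# Aggregate all heads together with a final 1x1 projection.
output = conv1x1(merge_heads(weighted), out_proj)
```

Under these assumptions, `heads` has shape `(N, C/N, H, W)` and `output` returns to `(C, H, W)`, which is why a projection appears both before the heads are split and after they are merged.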
Thanks for your help. It makes perfect sense now.
Hi,
I am trying to understand the implementation of equations 5, 6, and 7 in your paper, and would be thankful if you could help.
According to your paper, the support feature fS is flipped before being used to convolve R. However, a `1x1` convolution (line 247 below) is applied to the support feature fS before equation 5 is implemented. Are you using the projected support feature fS for equation 5?
https://github.com/zhiyuanyou/SAFECount/blob/de067f9f1ca2caea432dd4c2e6d9ec9b2a169ebf/models/safecount.py#L238-L262
Additionally, there is another `1x1` convolution (line 261) after equation 6 is implemented. Are you using the projected fR to implement equation 7?