PkuRainBow opened 6 years ago
My understanding of this picture is that you split the feature map (HxW) into MxN sub-feature maps and then apply the non-local operation separately within each sub-feature map. Is that the idea?
If so, I actually used this method in my last research work (but without the upper skip connection), and I will release my code soon. You can also implement it yourself: use `torch.split()` to split the feature map into MxN sub-feature maps and then apply the non-local operation within each one.
If not, does it mean each pixel should be the center of the kernel? That looks computationally expensive... I am quite busy these days; I will implement it if I have time.
@AlexHex7 Thanks. In fact, I mean the latter: each pixel should be the center of the kernel.
I wonder if you have integrated it into ResNet-101 or ResNet-50 for action recognition. I have tried, but I didn't get reasonable results. I'd like to discuss it further if you have done so.
@sxzy you mean you added Non-local block into 3D resnet, but got bad result?
Nope. I added it into a 2D ResNet and got bad results. Have you run any experiments with it? If yes, I'd look forward to you sharing them.
@sxzy you mean you added Non-local block into 3D resnet, but got bad result?
I use the two-stream approach, not 3D.
It works just fine for me:

```python
x = self.res50.layer1(x)     # [batch, 256, 96, 32]
x = self.nonLocal_1(x)
x = self.res50.layer2(x)
x = self.nonLocal_2(x)
x = self.res50.layer3[0](x)
x = self.nonLocal_3(x)
```
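For reference, the `nonLocal_*` modules above presumably wrap something like the standard embedded-Gaussian non-local block. A minimal self-contained sketch (the class name and zero-init choice are mine, not from the snippet above; zero-initializing the output conv makes the block start as an identity, which the non-local paper recommends for stable insertion into pretrained backbones):

```python
import torch
import torch.nn as nn

class NonLocalBlock2D(nn.Module):
    # Embedded-Gaussian non-local block for 2D feature maps (a sketch).
    def __init__(self, channels, reduction=2):
        super().__init__()
        inter = channels // reduction
        self.theta = nn.Conv2d(channels, inter, 1)  # query projection
        self.phi = nn.Conv2d(channels, inter, 1)    # key projection
        self.g = nn.Conv2d(channels, inter, 1)      # value projection
        self.out = nn.Conv2d(inter, channels, 1)
        # Zero-init so the residual branch contributes nothing at first.
        nn.init.zeros_(self.out.weight)
        nn.init.zeros_(self.out.bias)

    def forward(self, x):
        b, c, h, w = x.shape
        theta = self.theta(x).view(b, -1, h * w)     # [B, C', HW]
        phi = self.phi(x).view(b, -1, h * w)         # [B, C', HW]
        g = self.g(x).view(b, -1, h * w)             # [B, C', HW]
        attn = torch.softmax(theta.transpose(1, 2) @ phi, dim=-1)  # [B, HW, HW]
        y = (g @ attn.transpose(1, 2)).view(b, -1, h, w)
        return x + self.out(y)                        # residual connection

blk = NonLocalBlock2D(16)
x = torch.randn(1, 16, 8, 8)
y = blk(x)
```

Because of the zero-init, `y` equals `x` before any training, so dropping the block into a pretrained ResNet should not immediately degrade accuracy; if results are bad, the issue is more likely training setup than the block itself.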
Hi, I am wondering whether you have considered implementing the general form of the non-local operator, where attention is computed within a given kernel size around each pixel.
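One way to sketch that kernelized form is to gather each pixel's k x k neighborhood with `F.unfold` and restrict the attention to that window. This is only an illustration, assuming identity query/key/value projections; a real implementation would add learned 1x1 convs and pay attention to memory, since the unfolded tensor is k*k times the input size:

```python
import torch
import torch.nn.functional as F

def local_nonlocal(x, k=3):
    # Each pixel attends only to the k x k window centered on it,
    # rather than the whole H x W map.
    b, c, h, w = x.shape
    pad = k // 2
    # Neighborhoods of every pixel: [B, C*k*k, H*W]
    neigh = F.unfold(x, kernel_size=k, padding=pad)
    neigh = neigh.view(b, c, k * k, h * w)       # [B, C, K, HW]
    q = x.view(b, c, 1, h * w)                   # [B, C, 1, HW]
    scores = (q * neigh).sum(dim=1)              # dot products: [B, K, HW]
    attn = torch.softmax(scores, dim=1)          # softmax over the window
    out = (neigh * attn.unsqueeze(1)).sum(dim=2) # weighted sum: [B, C, HW]
    return x + out.view(b, c, h, w)              # residual connection

x = torch.randn(1, 8, 6, 6)
y = local_nonlocal(x, k=3)
```

Compared with the full non-local block, the cost drops from O((HW)^2) to O(HW * k^2), which is what makes the per-pixel-kernel variant tractable.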