Thanks to the author for the code. I have two small questions:
Regarding this small block of code below, used to obtain the CAM:
def forward_cam(self, x):
    x = super().forward(x)
    x = self.fc8(x)
    x = F.relu(x)
    x = torch.sqrt(x)
    return x
What is x = torch.sqrt(x) for?
Regarding the pretrained Caffe weights, i.e. vgg16_20M.prototxt: on what dataset was it trained? Did you mean to use the VGG16 version of the DeepLab-LargeFOV model as the network to compute CAM, and if so, why set fc6_dilation=1 rather than 12 as in the DeepLab v1 paper?
The VGG16_20M model was trained on ImageNet, as the DeepLab v1 paper says.
I just wanted to adopt DeepLab for CAM and AffinityNet because it will be used for semantic segmentation anyway.
I think the other questions are closely related to this. To convert DeepLab into a CAM network, I followed https://arxiv.org/pdf/1701.08261.pdf (which includes removing the last max-pooling layer, adjusting the dilation rate, etc.).
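For readers following along, here is a minimal sketch of what such a conversion might look like. It is not the repository's actual code: the layer names (fc6/fc7/fc8), the fc6_dilation argument, and the use of torchvision's VGG16 trunk are assumptions made for illustration only.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class VGG16CAM(nn.Module):
    # Hypothetical sketch: turning a VGG16 / DeepLab-LargeFOV-style backbone
    # into a CAM network, roughly in the spirit of the paper linked above.
    def __init__(self, num_classes=20, fc6_dilation=1):
        super().__init__()
        vgg = models.vgg16(weights=None)
        # Drop the last max-pooling layer so the class activation maps
        # keep a higher spatial resolution.
        self.features = nn.Sequential(*list(vgg.features.children())[:-1])
        # fc6 becomes a dilated 3x3 convolution; the dilation rate is a knob
        # (1 as asked about above, 12 in the DeepLab-LargeFOV setting).
        self.fc6 = nn.Conv2d(512, 1024, kernel_size=3,
                             padding=fc6_dilation, dilation=fc6_dilation)
        self.fc7 = nn.Conv2d(1024, 1024, kernel_size=1)
        # fc8 produces one activation map per class (the raw CAMs).
        self.fc8 = nn.Conv2d(1024, num_classes, kernel_size=1, bias=False)

    def forward_cam(self, x):
        x = self.features(x)
        x = F.relu(self.fc6(x))
        x = F.relu(self.fc7(x))
        x = self.fc8(x)
        x = F.relu(x)
        x = torch.sqrt(x)  # variance compensation, discussed below
        return x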
Then I found that, with the last max-pooling layer removed, the variance of the activation map became different; this can be simply compensated by x = torch.sqrt(x). (This is not necessary when trying other backbone networks such as VGG-GAP, https://github.com/metalbubble/CAM.)
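As a quick toy illustration of the variance point (my own example, not from the repository): an elementwise square root compresses the spread of non-negative activations, which is what the compensation relies on.

import torch

torch.manual_seed(0)
# Toy non-negative "activation map"; the larger scale stands in for the
# higher variance seen after removing the last max-pooling layer.
cam = torch.relu(torch.randn(1, 20, 32, 32) * 3.0)

print(cam.std().item())              # spread before compensation
print(torch.sqrt(cam).std().item())  # smaller spread after x = torch.sqrt(x)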