Closed beiyan1911 closed 5 days ago
Thank you for your interest in our work. We will update with a more complete version of the TASS code next week, with the hope that it will address your issue. The current implementation of the class attention map is fairly basic.
Thank you for releasing the code of this excellent work. Section 3.2 (Boundary-guided Mid-level Saliency Drift Regularization) of the paper states: "the mid-level saliency maps of our model are generated using GradCAM [42] at three stages of the CNN backbone", but in the code we only see `gradcam_net` implemented as a single convolution layer per stage (shown below). I would like to ask whether there is a special consideration here. Also, the saliency maps produced by GradCAM appear to be detached from the computation graph, i.e., they carry no gradient, so I am confused about how the model uses the saliency map S(x,j) to construct the loss L^{dbs}_{t}(x). I look forward to your response.
ResNet.py

```python
# line 186:
self.gradcam_net = nn.ModuleList([nn.Conv2d(in_c, 1, 3, padding=1) for in_c in [128, 256, 512]])
...
# line 259:
for i in range(len(intermediate_x)):
    # line 260:
    intermediate_x[i] = self.gradcam_net[i](intermediate_x[i])
```
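For reference, the GradCAM procedure the paper names is usually differentiable as long as the class-score gradients are taken with `create_graph=True`. Below is a minimal, self-contained sketch of standard Grad-CAM applied at multiple stages of a CNN; the `TinyBackbone` model, stage names, and class index are hypothetical stand-ins, not the TASS code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBackbone(nn.Module):
    """Hypothetical two-stage CNN standing in for the ResNet backbone."""
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Conv2d(3, 8, 3, padding=1)
        self.stage2 = nn.Conv2d(8, 16, 3, padding=1)
        self.head = nn.Linear(16, 10)

    def forward(self, x):
        f1 = F.relu(self.stage1(x))
        f2 = F.relu(self.stage2(f1))
        logits = self.head(f2.mean(dim=(2, 3)))  # global average pool + linear
        return logits, [f1, f2]                  # intermediate feature maps

def grad_cam(logits, feats, class_idx):
    """Standard Grad-CAM: weight each channel of a feature map by the
    spatially pooled gradient of the class score, sum over channels, ReLU.
    create_graph=True keeps the maps differentiable for use inside a loss."""
    score = logits[:, class_idx].sum()
    grads = torch.autograd.grad(score, feats, create_graph=True)
    maps = []
    for f, g in zip(feats, grads):
        w = g.mean(dim=(2, 3), keepdim=True)          # per-channel weights
        cam = F.relu((w * f).sum(dim=1, keepdim=True))  # (B, 1, H, W)
        maps.append(cam)
    return maps

x = torch.randn(2, 3, 32, 32)
model = TinyBackbone()
logits, feats = model(x)
cams = grad_cam(logits, feats, class_idx=0)
print([tuple(c.shape) for c in cams])  # one (B, 1, H, W) map per stage
```

This contrasts with the snippet above, where a learned 3x3 convolution maps each stage's features directly to a one-channel map; that projection is trainable end-to-end, which may be why it was used in place of the gradient-based formulation.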