jacobgil / pytorch-grad-cam

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
https://jacobgil.github.io/pytorch-gradcam-book
MIT License

How can I use grad-cam in FPN net? #31

Closed ChrisHJC closed 3 years ago

ChrisHJC commented 4 years ago

I have been using the FPN network structure recently, but I have been unable to visualize it properly with grad-cam. If anyone knows how to write the code, please let me know. Thanks a lot.

thisismexp commented 4 years ago

Hi, I had the exact same problem a few hours ago, and with some coding and debugging I got some results. My task was to apply Grad-CAM to a Faster R-CNN model to gain some insight into the backbone of my model (ResNet-34). In order to use this nice project I needed to adjust the __call__ functions of ModelOutputs and GuidedBackpropReLUModel. I then injected my custom functions into the project and used it as normal.

Customizing the ModelOutputs.__call__() function was needed because the FasterRCNN architecture contains several additional layers and transformations compared to e.g. the VGG architectures. To keep the other features of the pytorch_grad_cam package usable, I customized this call to find the deepest layer of the backbone network, save its activation and gradient information by calling the feature_extractor there, and then continue with the normal forward pass of the FasterRCNN implementation. To do this I reassembled the original implementation in GeneralizedRCNN.forward() (which can be found in torchvision.models.detection.generalized_rcnn.py:44) and here and there got rid of tensors with too many dimensions, because I am only interested in the best prediction for the image. The GuidedBackpropReLUModel.__call__() was altered in a similar fashion.

Overall this solution is a simplification: it dismisses a lot of the functionality of the FasterRCNN and effectively reduces the object detection network to an image classification network.
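For anyone looking for a starting point, here is a minimal sketch of the general idea described above (my own assumptions, not the actual code from this comment): hook the deepest backbone layer of a torchvision Faster R-CNN, run a normal forward pass, backpropagate from the best detection score, and combine activations and gradients Grad-CAM style. The model, target layer, and dummy input are assumptions.

```python
import torch
import torchvision

# Pretrained torchvision Faster R-CNN; older torchvision uses pretrained=True,
# newer versions use weights="DEFAULT" instead.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

# Deepest layer of the ResNet backbone inside the FPN.
target_layer = model.backbone.body.layer4

activations, gradients = [], []

def save_activation(module, inputs, output):
    activations.append(output)
    # Hook the output tensor so its gradient is captured during backward().
    output.register_hook(lambda grad: gradients.append(grad))

target_layer.register_forward_hook(save_activation)

# Dummy input; replace with a real preprocessed image (C, H, W) scaled to [0, 1].
image = torch.rand(3, 480, 640)
detections = model([image])[0]

# Backpropagate from the best detection score only
# (assumes at least one detection; use a real image).
detections["scores"][0].backward()

# Grad-CAM: global-average-pool the gradients, weight the activations, apply ReLU.
weights = gradients[0].mean(dim=(2, 3), keepdim=True)
cam = torch.relu((weights * activations[0]).sum(dim=1)).squeeze(0)
cam = cam / (cam.max() + 1e-8)
```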

I hope this helps a bit.

FraPochetti commented 3 years ago

Hi @thisismexp could you please share the code you have put together (if possible)? That'd be super useful!

hanikh commented 3 years ago

@thisismexp I am trying to implement grad-cam for faster-rcnn (detectron2 with the config file faster_rcnn_X_101_32x8d_FPN_3x.yaml) and I have a problem: all the elements of the cam (before the ReLU) are negative. I would like to ask why you used guided backpropagation. Do you think my problem is related to using plain backpropagation instead of the guided one?

jacobgil commented 3 years ago

@hanikh what is the target layer / loss function you are using? Is it for classification, or bounding box regression?

hanikh commented 3 years ago

@jacobgil I am using detectron2 with the config file "faster_rcnn_X_101_32x8d_FPN_3x.yaml". The target layer is "roi_heads.box_pooler" (the layer that pools the boxes from the different feature map levels). I calculate the gradient for the best score and NOT for the bounding boxes.

thisismexp commented 3 years ago

> Hi @thisismexp could you please share the code you have put together (if possible)? That'd be super useful!

Unfortunately I cannot share all the code with a broader audience, because I signed an NDA. However, I will send you some files via mail and hope this helps you out.

thisismexp commented 3 years ago

@hanikh honestly I have no idea, but maybe this project might help, as they are using detectron2 in their example: Grad-CAM.pytorch

hanikh commented 3 years ago

@thisismexp I have seen that before. It didn't help me.

jacobgil commented 3 years ago

I would like to help with this, so some kind of code snippet to run and reproduce the issue would be great. Otherwise I will try reproducing it myself, but it might take me a few days or more to get to it.

jacobgil commented 3 years ago

As a side note in case it helps:

In case it's the entire image: maybe it makes sense that it's negative, since most of the image is not the object. Maybe crop the activations/gradients to the bounding box location, if you aren't already doing it?
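A minimal sketch of that cropping idea (the function name and box format are assumptions, not code from this thread):

```python
import numpy as np

def crop_cam_to_box(cam: np.ndarray, box) -> np.ndarray:
    """Zero out the CAM outside an (x1, y1, x2, y2) box given in CAM coordinates."""
    x1, y1, x2, y2 = [int(round(v)) for v in box]
    masked = np.zeros_like(cam)
    masked[y1:y2, x1:x2] = cam[y1:y2, x1:x2]
    if masked.max() > 0:
        masked = masked / masked.max()
    return masked
```

Note that detection boxes are usually in input-image coordinates, so they would have to be rescaled to the CAM's spatial size before cropping.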

hanikh commented 3 years ago

@jacobgil may I have your email to share some parts of the code with you?

jacobgil commented 3 years ago

jacob.gildenblat@gmail.com

phuccuongngo99 commented 3 years ago

Hi, I'm working on this as well and attempted exactly what @hanikh did: setting the target layer to roi_heads.box_pooler. However, the result doesn't look right.

May I ask if you guys have solved it? Much appreciated. Thank you

shubham-scisar commented 2 years ago

@hanikh @jacobgil I am also trying to generate CAMs for detectron2, but with the config file faster_rcnn_R_101_FPN_3x.yaml. When I pass the image to the ScoreCAM or EigenCAM function, it requires the input as a tensor, grayscale_cam = cam(input_tensor, targets=targets), whereas the detectron2 model requires a list of dictionaries as input. Because of this I get an error in detectron2's batched inputs:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-19-f8134a721b63> in <module>
      6                reshape_transform=fasterrcnn_reshape_transform)
      7 
----> 8 grayscale_cam = cam(input_tensor, targets=targets)
      9 # Take the first image in the batch:
     10 grayscale_cam = grayscale_cam[0, :]

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/pytorch_grad_cam/base_cam.py in __call__(self, input_tensor, targets, aug_smooth, eigen_smooth)
    183 
    184         return self.forward(input_tensor,
--> 185                             targets, eigen_smooth)
    186 
    187     def __del__(self):

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/pytorch_grad_cam/base_cam.py in forward(self, input_tensor, targets, eigen_smooth)
     72                                                    requires_grad=True)
     73 
---> 74         outputs = self.activations_and_grads(input_tensor)
     75         if targets is None:
     76             target_categories = np.argmax(outputs.cpu().data.numpy(), axis=-1)

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/pytorch_grad_cam/activations_and_gradients.py in __call__(self, x)
     40         self.gradients = []
     41         self.activations = []
---> 42         return self.model(x)
     43 
     44     def release(self):

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/detectron2/modeling/meta_arch/rcnn.py in forward(self, batched_inputs)
    144         """
    145         if not self.training:
--> 146             return self.inference(batched_inputs)
    147 
    148         images = self.preprocess_image(batched_inputs)

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/detectron2/modeling/meta_arch/rcnn.py in inference(self, batched_inputs, detected_instances, do_postprocess)
    197         assert not self.training
    198 
--> 199         images = self.preprocess_image(batched_inputs)
    200         features = self.backbone(images.tensor)
    201 

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/detectron2/modeling/meta_arch/rcnn.py in preprocess_image(self, batched_inputs)
    222         Normalize, pad and batch the input images.
    223         """
--> 224         images = [x["image"].to(self.device) for x in batched_inputs]
    225         images = [(x - self.pixel_mean) / self.pixel_std for x in images]
    226         images = ImageList.from_tensors(images, self.backbone.size_divisibility)

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/detectron2/modeling/meta_arch/rcnn.py in <listcomp>(.0)
    222         Normalize, pad and batch the input images.
    223         """
--> 224         images = [x["image"].to(self.device) for x in batched_inputs]
    225         images = [(x - self.pixel_mean) / self.pixel_std for x in images]
    226         images = ImageList.from_tensors(images, self.backbone.size_divisibility)

IndexError: too many indices for tensor of dimension 2

Please help me figure out how to solve this issue.
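One possible workaround for the input-format mismatch described above (a sketch under my own assumptions, not something confirmed in this thread) is to wrap the detectron2 model in a small nn.Module that converts the plain tensor pytorch_grad_cam passes in into the list-of-dicts batch detectron2 expects:

```python
import torch

class Detectron2Wrapper(torch.nn.Module):
    """Adapts a detectron2 GeneralizedRCNN to the tensor input pytorch_grad_cam uses."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_tensor):
        # pytorch_grad_cam passes a batched (N, C, H, W) tensor; detectron2
        # expects [{"image": (C, H, W) tensor in the model's pixel format}, ...].
        batched_inputs = [{"image": img} for img in input_tensor]
        return self.model(batched_inputs)

# Hypothetical usage, reusing the names from the traceback above:
# cam = EigenCAM(model=Detectron2Wrapper(detectron2_model),
#                target_layers=[...],
#                reshape_transform=fasterrcnn_reshape_transform)
```

The wrapped model then returns detectron2's list of Instances rather than a tensor, so the targets and reshape_transform passed to the CAM object would also need to handle that output format.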