jacobgil / pytorch-grad-cam

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
https://jacobgil.github.io/pytorch-gradcam-book
MIT License

create targets for object detection #281

Closed zhangxiwensjtu closed 2 years ago

zhangxiwensjtu commented 2 years ago

Hi, I am trying to implement EigenCAM for my Swin_based_Cascade_RCNN network. In the Faster RCNN tutorial, I see that we set

targets = [FasterRCNNBoxScoreTarget()]

to specify what we want the CAM to show. However, when I dig into the code, it seems that targets is never used. For example, when I run:

grayscale_cam = cam(data, targets=targets)

execution enters the __call__ function, which calls the forward() function, which then calls

cam_per_layer = self.compute_cam_per_layer(input_tensor,
                                           targets,
                                           eigen_smooth)

and then it goes to:

cam = self.get_cam_image(input_tensor,
                         target_layer,
                         targets,
                         layer_activations,
                         layer_grads,
                         eigen_smooth)

However, that call ends up in:

    def get_cam_image(self,
                      input_tensor,
                      target_layer,
                      target_category,
                      activations,
                      grads,
                      eigen_smooth):
        return get_2d_projection(activations)

which never uses the targets parameter (named target_category here).

Also, I am confused about why targets is annotated as a list of nn.Module (List[torch.nn.Module]) in the function signature:

class BaseCAM:
    def __init__(self,
                 model: torch.nn.Module,
                 target_layers: List[torch.nn.Module],
                 use_cuda: bool = False,
                 reshape_transform: Callable = None,
                 compute_input_gradient: bool = False,
                 uses_gradients: bool = True) -> None:

Shouldn't it be a custom class?

jacobgil commented 2 years ago

targets is a list of custom modules, each of which outputs a score that we want the CAM to explain, such as a category score (or, in the case of object detection, maybe the IoU etc.). Since they get an image, use the model, and then output a scalar, I thought it made sense for the type annotation to be torch.nn.Module, but maybe it should actually be a "Callable".
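To make the "target is just a callable that returns a scalar" idea concrete, here is a minimal pure-Python sketch. It is not the library's FasterRCNNBoxScoreTarget: the class name BoxScoreTarget, the helper box_iou, and the scoring rule (IoU plus confidence for the best-matching box) are illustrative assumptions, and the input is assumed to be a detector output dictionary with "boxes", "labels" and "scores" entries. A gradient-based CAM would need the same logic written with torch tensors so the score stays differentiable.

```python
from typing import Dict, List, Sequence

Box = Sequence[float]  # (x1, y1, x2, y2)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class BoxScoreTarget:
    """Hypothetical target: a plain callable, not an nn.Module.

    For each reference box, find the best-overlapping detection and,
    if it overlaps enough and has the same label, add IoU + confidence.
    """

    def __init__(self, labels: List[int], boxes: List[Box],
                 iou_threshold: float = 0.5) -> None:
        self.labels = labels
        self.boxes = boxes
        self.iou_threshold = iou_threshold

    def __call__(self, model_outputs: Dict[str, list]) -> float:
        total = 0.0
        for ref_box, ref_label in zip(self.boxes, self.labels):
            best_iou, best_idx = 0.0, None
            for idx, box in enumerate(model_outputs["boxes"]):
                iou = box_iou(ref_box, box)
                if iou > best_iou:
                    best_iou, best_idx = iou, idx
            if (best_idx is not None
                    and best_iou > self.iou_threshold
                    and model_outputs["labels"][best_idx] == ref_label):
                total += best_iou + model_outputs["scores"][best_idx]
        return total
```

Because the object only needs to be callable and return a scalar, annotating targets as a list of callables would describe it just as well as List[torch.nn.Module].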

However, EigenCAM doesn't use targets. EigenCAM doesn't compute the CAM to optimize some property (for example, explaining a specific category), so it doesn't have class-specific explainability, and targets is not used at all. It will be used in the other methods that do have class-specific explainability, like AblationCAM (in the object detection tutorials) or GradCAM.
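A rough NumPy sketch of why the result cannot depend on targets: the map is a projection of the layer activations onto their first principal component, so only the activations enter the computation. The function names first_component_projection and eigen_cam_like are hypothetical, and the centering step and the (channels, height, width) layout are assumptions of this sketch rather than a copy of the library's get_2d_projection.

```python
import numpy as np


def first_component_projection(activations: np.ndarray) -> np.ndarray:
    """Project (channels, height, width) activations onto their first
    principal component, returning a (height, width) map."""
    channels = activations.shape[0]
    # One row per spatial location, one column per channel.
    flat = activations.reshape(channels, -1).T
    # Center each channel before the SVD (an assumption of this sketch).
    flat = flat - flat.mean(axis=0)
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return (flat @ vt[0]).reshape(activations.shape[1:])


def eigen_cam_like(activations: np.ndarray, targets=None) -> np.ndarray:
    # targets is accepted only to keep a uniform interface with
    # class-specific methods; it is deliberately ignored here.
    return first_component_projection(activations)
```

Calling eigen_cam_like with any targets value, or with none, returns the same map for the same activations, which matches the behaviour described above.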

I'm not sure whether the code still expects a list of targets; maybe it will work if the list is empty. I would have to check. In any case, EigenCAM doesn't use it.