jacobgil / pytorch-grad-cam

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
https://jacobgil.github.io/pytorch-gradcam-book
MIT License

Gradients and activations shape in the VIT example code #511

Open lunaryan opened 5 months ago

lunaryan commented 5 months ago

I tried to run the ViT example code; however, I ran into the following error.

Python 3.9.18 (main, Sep 11 2023, 13:41:44) [GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pytorch_grad_cam import GradCAM
>>> import torch
>>> model = torch.hub.load('facebookresearch/deit:main',
...     'deit_tiny_patch16_224', pretrained=True)
Using cache found in /home/.cache/torch/hub/facebookresearch_deit_main
>>> target_layers = [model.blocks[-1].norm1]
>>> image = torch.rand(1,3,224,224)
>>> cam = GradCAM(model=model, target_layers=target_layers)
>>> grayscale_cam = cam(input_tensor=image, targets=None)
(1, 197, 192)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data4/user/miniconda3/envs/anti-dreambooth/lib/python3.9/site-packages/pytorch_grad_cam/base_cam.py", line 188, in __call__
    return self.forward(input_tensor, targets, eigen_smooth)
  File "/data4/user/miniconda3/envs/anti-dreambooth/lib/python3.9/site-packages/pytorch_grad_cam/base_cam.py", line 112, in forward
    cam_per_layer = self.compute_cam_per_layer(input_tensor, targets, eigen_smooth)
  File "/data4/user/miniconda3/envs/anti-dreambooth/lib/python3.9/site-packages/pytorch_grad_cam/base_cam.py", line 143, in compute_cam_per_layer
    cam = self.get_cam_image(input_tensor, target_layer, targets, layer_activations, layer_grads, eigen_smooth)
  File "/data4/user/miniconda3/envs/anti-dreambooth/lib/python3.9/site-packages/pytorch_grad_cam/base_cam.py", line 66, in get_cam_image
    weights = self.get_cam_weights(input_tensor, target_layer, targets, activations, grads)
  File "/data4/user/miniconda3/envs/anti-dreambooth/lib/python3.9/site-packages/pytorch_grad_cam/grad_cam.py", line 32, in get_cam_weights
    raise ValueError("Invalid grads shape."
ValueError: Invalid grads shape.Shape of grads should be 4 (2D image) or 5 (3D image).

If I comment out the error throwing logic in grad_cam.py, then the shape check in https://github.com/jacobgil/pytorch-grad-cam/blob/1ff3f58818baa2889f3f51d0b9759783b4333ba0/pytorch_grad_cam/base_cam.py#L74 also fails.

Does the shape really matter? Is there a way to fix this or just work around it? Thanks!

Environment: PyTorch 2.1.0+cu121, grad-cam 1.5.2, Ubuntu 18.04.
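
For context, the (1, 197, 192) activation from blocks[-1].norm1 is a token sequence (1 class token + 196 patch tokens, 192 channels for deit_tiny), while GradCAM's weighting step expects a 4D BCHW tensor, which is what the shape check is complaining about. The ViT example in this repo handles that by passing a reshape_transform that drops the class token and folds the patch tokens back into a 14x14 grid. A minimal sketch of that usage (assuming the standard 224x224 input with 16-pixel patches):

import torch
from pytorch_grad_cam import GradCAM

model = torch.hub.load('facebookresearch/deit:main',
                       'deit_tiny_patch16_224', pretrained=True)
model.eval()
target_layers = [model.blocks[-1].norm1]

def reshape_transform(tensor, height=14, width=14):
    # Drop the class token and fold the 196 patch tokens into a 14x14 grid,
    # then move channels to dim 1 so the result is BCHW like a CNN feature map.
    result = tensor[:, 1:, :].reshape(tensor.size(0), height, width, tensor.size(2))
    result = result.transpose(2, 3).transpose(1, 2)
    return result

image = torch.rand(1, 3, 224, 224)
cam = GradCAM(model=model, target_layers=target_layers,
              reshape_transform=reshape_transform)
grayscale_cam = cam(input_tensor=image, targets=None)  # one 2D CAM per input image

With the reshape in place, the hooked activations and gradients become (1, 192, 14, 14), which satisfies the 4-dimensions check.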

huanhuanyuan7 commented 5 months ago

Did you resolve this issue? I have a similar problem:

Traceback (most recent call last):
  File "/root/.pycharm_helpers/pydev/pydevd.py", line 1496, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/root/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/workspace/add-lora-ssf-augfeature/grad-CAM/UIA-ViT-main/grad_cam_visualization.py", line 183, in <module>
    main(args)
  File "/workspace/add-lora-ssf-augfeature/grad-CAM/UIA-ViT-main/grad_cam_visualization.py", line 159, in main
    targets=target_category)
  File "/workspace/add-lora-ssf-augfeature/grad-CAM/UIA-ViT-main/pytorch_grad_cam/base_cam.py", line 186, in __call__
    return self.forward(input_tensor, targets, eigen_smooth)
  File "/workspace/add-lora-ssf-augfeature/grad-CAM/UIA-ViT-main/pytorch_grad_cam/base_cam.py", line 110, in forward
    cam_per_layer = self.compute_cam_per_layer(input_tensor, targets, eigen_smooth)
  File "/workspace/add-lora-ssf-augfeature/grad-CAM/UIA-ViT-main/pytorch_grad_cam/base_cam.py", line 141, in compute_cam_per_layer
    cam = self.get_cam_image(input_tensor, target_layer, targets, layer_activations, layer_grads, eigen_smooth)
  File "/workspace/add-lora-ssf-augfeature/grad-CAM/UIA-ViT-main/pytorch_grad_cam/base_cam.py", line 66, in get_cam_image
    weights = self.get_cam_weights(input_tensor, target_layer, targets, activations, grads)
  File "/workspace/add-lora-ssf-augfeature/grad-CAM/UIA-ViT-main/pytorch_grad_cam/grad_cam.py", line 23, in get_cam_weights
    if len(grads.shape) == 4:
AttributeError: 'NoneType' object has no attribute 'shape'
python-BaseException

loiqy commented 5 months ago

I have a similar problem, too. I tried to run it for DINOv2:

  File "/code/finetune-v2/visualize_gradcam.py", line 208, in gradcam_dataset
    gradcam_one_img(img_path)
  File "/code/finetune-v2/visualize_gradcam.py", line 152, in gradcam_one_img
    grayscale_cam = cam(input_tensor=img_tensor, targets=targets)
  File "/opt/conda/envs/dinov2/lib/python3.9/site-packages/pytorch_grad_cam/base_cam.py", line 186, in __call__
    return self.forward(input_tensor, targets, eigen_smooth)
  File "/opt/conda/envs/dinov2/lib/python3.9/site-packages/pytorch_grad_cam/base_cam.py", line 110, in forward
    cam_per_layer = self.compute_cam_per_layer(input_tensor, targets, eigen_smooth)
  File "/opt/conda/envs/dinov2/lib/python3.9/site-packages/pytorch_grad_cam/base_cam.py", line 141, in compute_cam_per_layer
    cam = self.get_cam_image(input_tensor, target_layer, targets, layer_activations, layer_grads, eigen_smooth)
  File "/opt/conda/envs/dinov2/lib/python3.9/site-packages/pytorch_grad_cam/base_cam.py", line 66, in get_cam_image
    weights = self.get_cam_weights(input_tensor, target_layer, targets, activations, grads)
  File "/opt/conda/envs/dinov2/lib/python3.9/site-packages/pytorch_grad_cam/grad_cam.py", line 23, in get_cam_weights
    if len(grads.shape) == 4:
AttributeError: 'NoneType' object has no attribute 'shape'

here is my code:

# Imports reconstructed for context (they were not in the original snippet); the
# DINOv2 helpers are assumed to come from the dinov2 eval utilities.
from functools import partial

import torch
import torch.nn as nn
from PIL import Image
from torchvision import transforms

from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.image import show_cam_on_image
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

from dinov2.eval.linear import LinearClassifier, create_linear_input  # assumed
from dinov2.eval.utils import ModelWithIntermediateLayers             # assumed

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

class Dino(nn.Module):
    def __init__(self, type, img_size, cls_num):
        super().__init__()
        # get feature model
        model = torch.hub.load(
            '', type, source='local'
        ).to(device)
        autocast_ctx = partial(
            torch.cuda.amp.autocast, enabled=True, dtype=torch.float16
        )
        self.feature_model = ModelWithIntermediateLayers(
            model, n_last_blocks=1, autocast_ctx=autocast_ctx
        ).to(device)

        with torch.no_grad():
            sample_input = torch.randn(1, 3, *img_size).to(device)
            sample_output = self.feature_model(sample_input)

        # get linear readout
        out_dim = create_linear_input(
            sample_output, use_n_blocks=1, use_avgpool=True
        ).shape[1]
        self.classifier = LinearClassifier(
            out_dim, use_n_blocks=1, use_avgpool=True, num_classes=cls_num
        ).to(device)

    def forward(self, x):
        x = self.feature_model(x)
        x = self.classifier(x)
        return x

data_transforms = {
    'train': transforms.Compose([
        transforms.Resize(args.image_size),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(args.image_size),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

model = Dino(args.arch, args.image_size, 2)
# print(model)
checkpoint_path = f'/code/finetune-v2/best_models/best_model_{model_name}.pth'
checkpoint = torch.load(checkpoint_path, map_location='cpu')
model.load_state_dict(checkpoint)
model.eval()
model = model.cuda()

target_layers = [model.feature_model.feature_model.blocks[-1].norm1]
# target_layers = [model.feature_model.feature_model.blocks[10]]
print(target_layers[0])

targets = [ClassifierOutputTarget(0)]

def reshape_transform(tensor):
    result = tensor[:, 1:, :].reshape(tensor.size(0),
                                      args.image_size[0] // 14, args.image_size[1] // 14, tensor.size(2))

    # Bring the channels to the first dimension,
    # like in CNNs.
    result = result.transpose(2, 3).transpose(1, 2)
    return result

def gradcam_one_img(img_path):
    with GradCAM(model=model, target_layers=target_layers, reshape_transform=reshape_transform) as cam:
        rgb_img = Image.open(img_path).convert('RGB')
        img_tensor = data_transforms['val'](rgb_img).unsqueeze(0).cuda()
        print(img_tensor.shape)
#         outputs = model(img_tensor)
        grayscale_cam = cam(input_tensor=img_tensor, targets=targets)
#         grayscale_cam = cam(input_tensor=img_tensor, targets=targets)
        grayscale_cam = grayscale_cam[0, :]
        visualization = show_cam_on_image(rgb_img, grayscale_cam, use_rgb=True)
        print(visualization)
        return cam.outputs, visualization

gradcam_one_img(img_path)
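
One possible culprit here (an assumption on my part, since ModelWithIntermediateLayers comes from the DINOv2 eval utilities): that wrapper runs the backbone under torch.inference_mode() and autocast, so no gradients ever flow back to blocks[-1].norm1 and GradCAM's backward hook records None, which would explain the 'NoneType' object has no attribute 'shape'. A minimal sketch of a gradient-friendly wrapper to swap in just for the CAM pass (the get_intermediate_layers call mirrors what the DINOv2 wrapper does, but without disabling autograd):

import torch.nn as nn

class GradFriendlyIntermediateLayers(nn.Module):
    """Hypothetical drop-in for ModelWithIntermediateLayers that keeps autograd
    enabled, so GradCAM's backward hooks actually receive gradients."""

    def __init__(self, feature_model, n_last_blocks=1):
        super().__init__()
        self.feature_model = feature_model
        self.n_last_blocks = n_last_blocks

    def forward(self, images):
        # Same call the DINOv2 eval wrapper makes, but not wrapped in
        # torch.inference_mode()/autocast, so the graph is recorded.
        return self.feature_model.get_intermediate_layers(
            images, self.n_last_blocks, return_class_token=True
        )

If that is the cause, building Dino with this wrapper (and keeping the float32 path) when computing CAMs should give get_cam_weights real gradients instead of None.
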
zermatt-luo commented 4 months ago

Hello, did you resolve this issue? I have a similar problem!

hhuzzz commented 4 months ago

I ran into the same problem!

ciroimmobile commented 3 months ago

I have the same problem. Have you solved it?

cydawn commented 3 months ago

Same problem here. You need to check the implementation of the transformer you are using and pick a target_layer that matches it, such as model.layer[-1].norm1 (see the sketch below).
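
To make that check concrete, you can register a temporary forward hook on the candidate target_layer and print its output shape before wiring up GradCAM (a generic sketch; model and target_layers are whatever you already built for your own network, and the 224x224 input is just an example):

import torch

# Inspect what the chosen target_layer actually outputs: a (B, tokens, C) token
# sequence needs a reshape_transform, while a (B, C, H, W) map can be used as-is.
shapes = []
hook = target_layers[0].register_forward_hook(
    lambda module, inputs, output: shapes.append(tuple(output.shape))
)
with torch.no_grad():
    device = next(model.parameters()).device
    model(torch.rand(1, 3, 224, 224).to(device))
hook.remove()
print(shapes)  # e.g. [(1, 197, 192)] for deit_tiny: a token sequence, so reshape it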