Trotts opened this issue 2 years ago
Update:
I fixed the above error by calling

```python
car_concept_features.requires_grad_()
cloud_concept_features.requires_grad_()
```

before

```python
# Where is the car in the image
with GradCAM(model=model,
             target_layers=target_layers,
             use_cuda=False) as cam:
    car_grayscale_cam = cam(input_tensor=input_tensor,
                            targets=car_targets)[0, :]
```
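For context, a minimal sketch of why the in-place `requires_grad_()` call matters (the feature dimension and tensors here are made up for illustration; in the notebook the concept features come from the embedding model):

```python
import torch

# Stand-in for a concept embedding; features computed by the model are
# often detached from the autograd graph.
car_concept_features = torch.randn(128)

# Without this, backprop through a similarity-to-concept target fails.
car_concept_features.requires_grad_()

# A similarity against the features is now differentiable:
query = torch.randn(128, requires_grad=True)
sim = torch.nn.functional.cosine_similarity(car_concept_features, query, dim=0)
sim.backward()
print(car_concept_features.grad is not None)  # True
```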
However, I now seem to be running into the following:

```
An exception occurred in CAM with block: <class 'numpy.AxisError'>. Message: axis 2 is out of bounds for array of dimension 0
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-9-5cdc1cb41639> in <module>
     21
     22
---> 23 car_cam_image = show_cam_on_image(image_float, car_grayscale_cam, use_rgb=True)
     24 Image.fromarray(car_cam_image)

NameError: name 'car_grayscale_cam' is not defined
```
Looking back through the closed issues on this topic, it seems to be a problem with the layers I specify for `target_layers` - am I correct in thinking this?
If so, I currently select `loaded_model.module.convnet[-1]` as the target, which corresponds to:

```
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
```

Any help with the above is greatly appreciated :)
Hi,
Can you please try `loaded_model.module.convnet[-2]` and tell me if it worked?
We need the 2D CNN activations from before the pooling.
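As a quick illustration of the suggestion (toy layers, not the actual model), negative indexing on an `nn.Sequential` picks out the layer just before the pooling:

```python
import torch.nn as nn

# Toy stand-in for the convnet; channel counts are illustrative only.
convnet = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=5),
    nn.MaxPool2d(kernel_size=2, stride=2),
)

print(type(convnet[-1]).__name__)  # MaxPool2d: the pooling layer
print(type(convnet[-2]).__name__)  # Conv2d: the 2D activations CAM hooks into
```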
Hi @jacobgil, I tried the suggestion but the same error occurs:

```python
target_layers = [loaded_model.module.convnet[-2]]

car_targets = [SimilarityToConceptTarget(car_concept_features)]
cloud_targets = [SimilarityToConceptTarget(cloud_concept_features)]

# Where is the car in the image
with GradCAM(model=model,
             target_layers=target_layers,
             use_cuda=False) as cam:
    car_grayscale_cam = cam(input_tensor=input_tensor,
                            targets=car_targets)[0, :]

car_cam_image = show_cam_on_image(image_float, car_grayscale_cam, use_rgb=True)
Image.fromarray(car_cam_image)
```
Results in:

```
An exception occurred in CAM with block: <class 'numpy.AxisError'>. Message: axis 2 is out of bounds for array of dimension 0
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-13-784011a42087> in <module>
     20                             targets=car_targets)[0, :]
     21
---> 22 car_cam_image = show_cam_on_image(image_float, car_grayscale_cam, use_rgb=True)
     23 Image.fromarray(car_cam_image)

NameError: name 'car_grayscale_cam' is not defined
```
Am I correct in thinking `target_layers = [loaded_model.module.convnet[-2]]` is where you wanted the change to be made?
Hi, sorry for the late response, I was traveling.
What does `target_layers` look like now? What is the output shape you expect from that layer? The CAM algorithms expect it to have the shape batch x channels x height x width. Is that what we have there?
If the dimension is different, we will need to write a `reshape_transform`.
Also, what is the dimension of `input_tensor`?
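For reference, a `reshape_transform` is just a callable that maps the hooked activations into batch x channels x height x width. A hedged sketch for flattened (batch, tokens, channels) activations, with the 14x14 grid and 768 channels assumed purely for illustration:

```python
import torch

def reshape_transform(tensor, height=14, width=14):
    # (batch, tokens, channels) -> (batch, channels, height, width),
    # assuming the tokens form a height x width spatial grid.
    result = tensor.reshape(tensor.size(0), height, width, tensor.size(2))
    return result.permute(0, 3, 1, 2)

activations = torch.randn(1, 14 * 14, 768)
print(reshape_transform(activations).shape)  # torch.Size([1, 768, 14, 14])
```

A transform like this would be passed to the CAM constructor via its `reshape_transform` argument.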
Hi, also sorry for my late reply.
Printing `target_layers` shows: `[Conv2d(59, 59, kernel_size=(5, 5), stride=(1, 1))]`
Using `torchsummary` to get the expected output for a (3, 300, 300) image:

```python
summary(loaded_model.module, (3, 300, 300))
```

gives:

```
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 59, 295, 295]           6,431
         MaxPool2d-2         [-1, 59, 147, 147]               0
              ReLU-3         [-1, 59, 147, 147]               0
           Dropout-4         [-1, 59, 147, 147]               0
            Conv2d-5         [-1, 59, 143, 143]          87,084
         MaxPool2d-6           [-1, 59, 71, 71]               0
            Linear-7                  [-1, 106]      31,526,520
              ReLU-8                  [-1, 106]               0
================================================================
Total params: 31,620,035
Trainable params: 31,620,035
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 1.03
Forward/backward pass size (MB): 79.83
Params size (MB): 120.62
Estimated Total Size (MB): 201.48
----------------------------------------------------------------
```

The model's expected input is `[1, 3, 300, 300]`, with images reshaped to (300, 300) before input:
`print(image.shape)` gives `(300, 300, 3)`.
`input_tensor.shape` gives `torch.Size([1, 3, 300, 300])`.
So I believe everything is in the shape expected, at least before passing to the model/CAM; however, I may have missed something?
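One way to double-check the target layer's output shape directly (a sketch using a toy `Conv2d` mirroring the reported target layer, and the 147x147 activation size from the summary above) is a forward hook:

```python
import torch
import torch.nn as nn

# Mirrors the reported target layer: Conv2d(59, 59, kernel_size=(5, 5))
layer = nn.Conv2d(59, 59, kernel_size=5)

shapes = []
handle = layer.register_forward_hook(
    lambda module, inputs, output: shapes.append(tuple(output.shape)))

# 147x147 is the activation size feeding Conv2d-5 in the summary above.
layer(torch.randn(1, 59, 147, 147))
handle.remove()

print(shapes[0])  # (1, 59, 143, 143): 4D, as the CAM algorithms expect
```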
Hello! Have you solved this problem now? I have the same problem.
Hi,
I am trying to run GradCAM over a custom architecture I have created. The architecture is as follows:
This architecture is an embedding network, so I am using the [Pixel Attribution for Embeddings notebook](https://github.com/jacobgil/pytorch-grad-cam/blob/master/tutorials/Pixel%20Attribution%20for%20embeddings.ipynb) to try to generate a heatmap. Currently, I have it set to just run on the default images for the moment.
When running the code for "Where is the car in the image", I am running into the following error:
From a few other closed threads on this issue, it seems there is something I need to do with:

```python
with torch.no_grad()
```

However, I am at a complete loss as to where this needs to go, or whether it is a deeper problem with my custom embedding network. Any help would be greatly appreciated. I am running Python 3.6.9 and grad-cam version 1.4.5.
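For what it's worth, a minimal sketch of the `torch.no_grad()` behavior in question (toy linear model, not the actual embedding network): features produced inside the context are detached from the autograd graph, which is what later breaks a CAM backward pass unless gradient tracking is re-enabled.

```python
import torch

model = torch.nn.Linear(8, 4)      # toy stand-in for the embedding network
x = torch.randn(1, 8)

with torch.no_grad():
    feats = model(x)               # computed outside the autograd graph

print(feats.requires_grad)         # False: nothing to backpropagate through

feats.requires_grad_()             # re-enable gradient tracking in place
print(feats.requires_grad)         # True
```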
Code (dirs edited out):