jacobgil / pytorch-grad-cam

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
https://jacobgil.github.io/pytorch-gradcam-book
MIT License

numpy.AxisError: axis 2 is out of bounds for array of dimension 2 #422

Open jscott-gauss opened 1 year ago

jscott-gauss commented 1 year ago

Related issues: https://github.com/jacobgil/pytorch-grad-cam/issues/254 https://github.com/jacobgil/pytorch-grad-cam/issues/394 (maybe)

The model is resnest14d from timm, and I am regressing to a single scalar value.

targets = [RawScoresOutputTarget()]         # Initially tried this
targets = [BinaryClassifierOutputTarget(1)] # Tried this after reading above issue, same result
cam_result = cam(input_tensor=image_tensor, targets=targets, eigen_smooth=False)

Debugger printout of the input arguments just before this line: https://github.com/jacobgil/pytorch-grad-cam/blob/2183a9cbc1bd5fc1d8e134b4f3318c3b6db5671f/pytorch_grad_cam/grad_cam.py#L16

(Pdb) input_tensor.size()
torch.Size([1, 3, 1152, 144])
(Pdb) target_layer
Linear(in_features=2048, out_features=1, bias=True)
(Pdb) targets
[<pytorch_grad_cam.utils.model_targets.BinaryClassifierOutputTarget object at 0x7effaa581af0>]
(Pdb) activations
array([[7.7802935]], dtype=float32)
(Pdb) grads
array([[1.]], dtype=float32)

Stack Trace

Traceback (most recent call last):
  File "test_and_produce_conc_data.py", line 113, in <module>
    main()
  File "/NeptuneCanister/dl_models/utils/hydra.py", line 108, in wrapper
    return func(cfg, args)
  File "test_and_produce_conc_data.py", line 82, in main
    cam_result = cam(input_tensor=images[i,:,:,:][None, :, :, :], targets=targets, eigen_smooth=False)
  File "/venv/lib/python3.8/site-packages/pytorch_grad_cam/base_cam.py", line 188, in __call__
    return self.forward(input_tensor,
  File "/venv/lib/python3.8/site-packages/pytorch_grad_cam/base_cam.py", line 95, in forward
    cam_per_layer = self.compute_cam_per_layer(input_tensor,
  File "/venv/lib/python3.8/site-packages/pytorch_grad_cam/base_cam.py", line 127, in compute_cam_per_layer
    cam = self.get_cam_image(input_tensor,
  File "/venv/lib/python3.8/site-packages/pytorch_grad_cam/base_cam.py", line 50, in get_cam_image
    weights = self.get_cam_weights(input_tensor,
  File "/venv/lib/python3.8/site-packages/pytorch_grad_cam/grad_cam.py", line 22, in get_cam_weights
    return np.mean(grads, axis=(2, 3))
  File "<__array_function__ internals>", line 180, in mean
  File "/venv/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 3432, in mean
    return _methods._mean(a, axis=axis, dtype=dtype,
  File "/venv/lib/python3.8/site-packages/numpy/core/_methods.py", line 168, in _mean
    rcount = _count_reduce_items(arr, axis, keepdims=keepdims, where=where)
  File "/venv/lib/python3.8/site-packages/numpy/core/_methods.py", line 76, in _count_reduce_items
    items *= arr.shape[mu.normalize_axis_index(ax, arr.ndim)]
numpy.AxisError: axis 2 is out of bounds for array of dimension 2

ivan-alles commented 10 months ago

I have the same problem with a binary classifier whose fully connected output layer has size (B, 1), when I select this layer as the target layer in combination with BinaryClassifierOutputTarget.

If I use the last convolution layer in the model before the FC layers, it works. It also works if I reshape the output to (B, 1, 1, 1), but then it produces meaningless heatmaps.

It looks like Grad-CAM expects spatial dimensions in the output. But according to the paper: "our approach is a generalization of CAM [59] and is applicable to a significantly broader range of CNN model families: (1) CNNs with fully-connected layers (e.g. VGG)".

Or am I doing something wrong? What is the right way of handling fully connected layers?
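
The shape dependence described above can be reproduced with NumPy alone. This is a minimal sketch, not the library's code: `grad_cam_weights` and `grad_cam_map` are hypothetical helpers that only mirror the `np.mean(grads, axis=(2, 3))` call from the stack trace, and the array shapes are assumptions chosen for illustration.

```python
import numpy as np


def grad_cam_weights(grads):
    """Per-channel weights: mean gradient over the spatial axes (H, W)."""
    # Hypothetical helper; mirrors np.mean(grads, axis=(2, 3)) from the traceback.
    return np.mean(grads, axis=(2, 3))


def grad_cam_map(activations, grads):
    """Weighted sum of activation maps over channels, followed by ReLU."""
    weights = grad_cam_weights(grads)                              # (B, C)
    cam = (weights[:, :, None, None] * activations).sum(axis=1)   # (B, H, W)
    return np.maximum(cam, 0)


rng = np.random.default_rng(0)

# Case 1: a convolutional layer, activations and gradients are (B, C, H, W).
conv_acts = rng.random((1, 8, 4, 4)).astype(np.float32)
conv_grads = rng.standard_normal((1, 8, 4, 4)).astype(np.float32)
conv_cam = grad_cam_map(conv_acts, conv_grads)
print(conv_cam.shape)  # (1, 4, 4)

# Case 2: the Linear layer from the debugger printout, shape (B, 1).
fc_acts = np.array([[7.7802935]], dtype=np.float32)
fc_grads = np.array([[1.0]], dtype=np.float32)
fc_error = None
try:
    grad_cam_weights(fc_grads)
except ValueError as exc:  # numpy's AxisError subclasses ValueError
    fc_error = exc
print(type(fc_error).__name__, fc_error)

# Case 3: reshaping to (B, 1, 1, 1) avoids the error but leaves a 1x1 map.
reshaped_cam = grad_cam_map(fc_acts.reshape(1, 1, 1, 1),
                            fc_grads.reshape(1, 1, 1, 1))
print(reshaped_cam.shape)  # (1, 1, 1)
```

In this sketch the reshaped case yields a single value per image, so upscaling it to the input size gives a constant image. That may be why the (B, 1, 1, 1) workaround produces heatmaps with no spatial information, while targeting the last convolution layer works.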

wujiang0156 commented 6 months ago

I am running into the same problem. Did you solve it? Thanks.