Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
Hi Jacob,

I am trying to visualize attention maps for video data. I am using ViViT model 2, and my inputs have shape [B x T x C x H x W]. I tried Grad-CAM, but I got this error: axis 2 is out of bounds for array of dimension 0.
The error occurs in grad_cam.py, at the line returning np.mean(grads, axis=(2,3)) in the grad_cam_weights function.
Is there any way to use Grad-CAM for video data?

Thanks in advance
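
For context, the reported message is what NumPy raises when a mean over axes (2, 3) is requested on a zero-dimensional value. That suggests the gradient value reaching the call was empty or missing rather than a 5-D tensor, though this is an assumption the post does not confirm. The sketch below reproduces the message and shows one possible way to average gradients of any rank above two. `cam_weights` and `tokens_to_volume` are hypothetical helpers written for illustration, not part of the library, and the token layout (no class token, tokens ordered as T x H x W) is also assumed.

```python
import numpy as np


def cam_weights(grads):
    """Average gradients over every axis after (batch, channel).

    Hypothetical helper: generalizes a mean over axes (2, 3) to any
    number of trailing axes, e.g. (T, H, W) for video.
    """
    grads = np.asarray(grads)
    if grads.ndim < 3:
        # Fail with a clearer message than NumPy's axis error.
        raise ValueError(f"expected (B, C, ...) gradients, got ndim={grads.ndim}")
    return grads.mean(axis=tuple(range(2, grads.ndim)))


def tokens_to_volume(tokens, t, h, w):
    """Reshape (B, T*H*W, D) tokens to (B, D, T, H, W).

    Assumption: no class token, and tokens are ordered frame by frame,
    then row by row. A real ViViT layer may differ.
    """
    b, n, d = tokens.shape
    if n != t * h * w:
        raise ValueError(f"token count {n} does not match {t}*{h}*{w}")
    return tokens.reshape(b, t, h, w, d).transpose(0, 4, 1, 2, 3)


# Reproduce the reported message with a zero-dimensional array.
try:
    np.mean(np.array(0.0), axis=(2, 3))
except ValueError as exc:  # NumPy's AxisError subclasses ValueError
    reproduced = str(exc)

# Stand-in gradients for 4 frames of 7 x 7 patches with 16 channels.
fake_token_grads = np.ones((2, 4 * 7 * 7, 16), dtype=np.float32)
volume = tokens_to_volume(fake_token_grads, t=4, h=7, w=7)
weights = cam_weights(volume)  # one weight per (batch, channel) pair
```

If the chosen layer emits tokens instead of a channel-first volume, a reshape along these lines would likely be needed before any spatial averaging. Whether that is enough for ViViT model 2 has not been verified here.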