jacobgil / pytorch-grad-cam

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
https://jacobgil.github.io/pytorch-gradcam-book
MIT License
9.79k stars 1.52k forks

Video Classification #467

Open ZziTaiLeo opened 7 months ago

ZziTaiLeo commented 7 months ago

I want to use CAM to analyze the network's attention to each frame in a gait recognition task. The input video has shape (B=1, T, H, W). A 2D CNN such as ResNet extracts features from each frame, giving intermediate features of shape (T, d); these are then aggregated temporally to obtain the final features, which are classified. My target_layer is set to the last layer of the ResNet, and I am not sure whether the generated heat map is valid.
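
For illustration, the setup described above can be sketched in plain PyTorch. This is a minimal sketch under stated assumptions, not the pytorch-grad-cam API and not the actual gait model: `TinyGaitNet` is a hypothetical stand-in (two conv layers instead of a ResNet, mean pooling as the temporal aggregation), and `per_frame_grad_cam` is a hypothetical helper. It computes one Grad-CAM map per frame by treating the T frames as the batch dimension of the backbone's last feature map.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyGaitNet(nn.Module):
    """Hypothetical stand-in: per-frame 2D CNN -> (T, d) -> temporal mean -> classifier."""

    def __init__(self, d=8, num_classes=4):
        super().__init__()
        # Assumption: a tiny backbone in place of a ResNet.
        self.backbone = nn.Sequential(
            nn.Conv2d(1, d, 3, padding=1), nn.ReLU(),
            nn.Conv2d(d, d, 3, padding=1), nn.ReLU(),
        )
        self.fc = nn.Linear(d, num_classes)

    def forward(self, video):
        # video: (B=1, T, H, W) -> frames: (T, 1, H, W), one channel per frame
        frames = video.squeeze(0).unsqueeze(1)
        fmap = self.backbone(frames)              # (T, d, H, W), the "target layer" output
        feats = fmap.mean(dim=(2, 3))             # (T, d) per-frame features
        pooled = feats.mean(dim=0, keepdim=True)  # (1, d), assumed temporal aggregation
        return self.fc(pooled), fmap


def per_frame_grad_cam(model, video, class_idx):
    """Return a (T, H, W) tensor holding one Grad-CAM map per frame."""
    model.eval()
    logits, fmap = model(video)
    # Gradient of the chosen class score w.r.t. the last backbone feature map.
    (grads,) = torch.autograd.grad(logits[0, class_idx], fmap)
    # Grad-CAM channel weights: spatial mean of the gradients, per frame.
    weights = grads.mean(dim=(2, 3), keepdim=True)         # (T, d, 1, 1)
    cam = F.relu((weights * fmap).sum(dim=1)).detach()     # (T, H, W)
    # Normalize over the whole clip, not per frame, so frames stay comparable.
    return cam / (cam.max() + 1e-8)


torch.manual_seed(0)
model = TinyGaitNet()
video = torch.rand(1, 5, 16, 16)      # B=1, T=5, H=W=16
cam = per_frame_grad_cam(model, video, class_idx=2)
frame_scores = cam.sum(dim=(1, 2))    # (T,) rough per-frame importance
```

Two design choices in this sketch may matter for the question. Scaling by the clip-wide maximum keeps the relative strength of frames visible, whereas normalizing each frame separately would make every frame look equally important. Also, with mean temporal pooling as assumed here, every frame receives the same gradient, so differences between frames come only from the activations; a different aggregation (for example max pooling or attention) would change the per-frame weights.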