open-mmlab / mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
https://mmaction2.readthedocs.io
Apache License 2.0
4.15k stars 1.22k forks

Runtime Error while GradCam slowfast recognition #2197

Open dralmadani opened 1 year ago

dralmadani commented 1 year ago

I am re-implementing grad-cam algorithms for slowfast model, following the gradcam demo provided by MMAction2 (MMAction2 GradCAM utils only).

Here is my code.

%cd /content/mmaction2

%cd checkpoints
!wget https://download.openmmlab.com/mmaction/recognition/slowfast/slowfast_prebn_r50_4x16x1_256e_kinetics400_rgb/slowfast_prebn_r50_4x16x1_256e_kinetics400_rgb_20210722-bb725050.pth
%cd /content/mmaction2

model_path = 'checkpoints/slowfast_prebn_r50_4x16x1_256e_kinetics400_rgb_20210722-bb725050.pth'
gradcam_result_path = 'demo/output_files/demo_gradcam_slowfast.gif'
cfg = 'configs/recognition/slowfast/slowfast_r50_video_4x16x1_256e_kinetics400_rgb.py'
video_path = 'demo/demo.mp4'

!python demo/demo_gradcam.py $cfg \
  $model_path \
  $video_path  \
  --target-layer-name backbone/fast_path/layer4/1/relu \
  --out-filename $gradcam_result_path --device cpu

In the example, MMAction2 uses --target-layer-name backbone/layer4/1/relu for TSM, but which layer should I use for SlowFast? The candidates I am considering are:

--target-layer-name backbone/fast_path/layer4/1/relu
--target-layer-name backbone/fast_path/layer4/2/relu
--target-layer-name backbone/slow_path/layer4/1/relu
--target-layer-name backbone/slow_path/layer4/2/relu
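A quick way to sanity-check a candidate --target-layer-name is to resolve it against the model the same way a '/'-separated path would be walked: split on '/', treat numeric parts as indices into a sequential container, and attribute-access the rest. This is a minimal stand-alone sketch with stand-in classes (the real model is a torch.nn.Module, and the exact resolution logic in MMAction2's GradCAM utils may differ):

```python
class Block:
    """Stand-in for a residual block; in the real model this is a torch.nn module."""
    def __init__(self):
        self.relu = "ReLU"


class Path:
    def __init__(self):
        # layer4 holds indexed sub-blocks, like an nn.Sequential
        self.layer4 = [Block(), Block(), Block()]


class Backbone:
    def __init__(self):
        self.slow_path = Path()
        self.fast_path = Path()


class Model:
    def __init__(self):
        self.backbone = Backbone()


def resolve_layer(model, target_layer_name):
    """Walk a '/'-separated layer name; numeric parts index into a container."""
    layer = model
    for part in target_layer_name.split('/'):
        layer = layer[int(part)] if part.isdigit() else getattr(layer, part)
    return layer


model = Model()
print(resolve_layer(model, 'backbone/fast_path/layer4/1/relu'))  # -> ReLU
```

If the resolution raises an AttributeError or IndexError, the name is not a valid target layer for that model.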

Dai-Wenxun commented 1 year ago

How about trying all the mentioned layers and then checking the visualization result? From my perspective, layer4 is a suitable choice.

dralmadani commented 1 year ago

I found that the problem was the GPU I used in Colab did not have enough memory. When I switched to an A100, everything worked well, and I got a 32 MB GIF file for a 4-second video!

However, there are now two options: the slow path ("backbone/slow_path/layer4/2/relu") and the fast path ("backbone/fast_path/layer4/2/relu"). Which is the last layer that produces the features before the fully connected (FC) layer?

Dai-Wenxun commented 1 year ago

There is no single last layer that fuses the features from the two paths before the FC. The FC layer processes the features (a tuple) from the slow path and the fast path simultaneously. Check this file for more details.
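The behavior described above can be sketched in plain NumPy: each pathway's feature map is globally average-pooled, the pooled vectors are concatenated, and a single FC layer maps the concatenated vector to class scores. The shapes below are illustrative assumptions, not the exact ones from the config:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative feature maps from the two pathways: (channels, T, H, W).
# The slow path has more channels but fewer frames; the fast path the reverse.
slow_feat = rng.standard_normal((2048, 4, 7, 7))
fast_feat = rng.standard_normal((256, 32, 7, 7))


def global_avg_pool(feat):
    """Average over the temporal and spatial dims, keeping only channels."""
    return feat.mean(axis=(1, 2, 3))


# Concatenate the pooled features from both paths: (2048 + 256,) = (2304,)
pooled = np.concatenate([global_avg_pool(slow_feat),
                         global_avg_pool(fast_feat)])

num_classes = 400  # Kinetics-400
fc_weight = rng.standard_normal((num_classes, pooled.size)) * 0.01
fc_bias = np.zeros(num_classes)

# One FC layer maps the fused vector to class scores: shape (400,)
scores = fc_weight @ pooled + fc_bias
print(scores.shape)
```

So there is no intermediate fusion module to target with GradCAM; the last convolutional activations are per-pathway, which is why both "backbone/slow_path/layer4/2/relu" and "backbone/fast_path/layer4/2/relu" are reasonable (and complementary) target layers.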