facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Apache License 2.0

How to implement Grad-CAM for Mvit and Mvitv2? #685

Open Leozyc-waseda opened 1 year ago

Leozyc-waseda commented 1 year ago

I'm planning to train MViT and MViTv2 models and visualize their predictions. Does anyone know how to write the Grad-CAM section of the config for MViT and MViTv2?

For reference, this is how it's done for SlowFast:

  MODEL_VIS:
    ENABLE: True
    # MODEL_WEIGHTS: True # Set to True to visualize model weights.
    # ACTIVATIONS: True # Set to True to visualize feature maps.
    # INPUT_VIDEO: True # Set to True to visualize the input video(s) for the corresponding feature maps.
    # LAYER_LIST:  ['s5/pathway1_res2', 's5/pathway0_res2'] # List of layer names to visualize weights and activations for.
    GRAD_CAM:
      ENABLE: True
      LAYER_LIST: ['s5/pathway1_res2', 's5/pathway0_res2'] # List of CNN layers to use for Grad-CAM visualization method.
                  # The number of layer must be equal to the number of pathway(s).
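
The constraint in the last comment (one Grad-CAM layer per pathway) can be checked before launching a run. The sketch below is a hypothetical helper, not part of PySlowFast: `check_grad_cam_layers` and its arguments are illustrative names, and it assumes SlowFast has two pathways while a single-stream model such as MViT has one.

```python
def check_grad_cam_layers(layer_list, num_pathways):
    """Return True if there is exactly one Grad-CAM layer per pathway.

    Hypothetical helper for illustration; raises ValueError otherwise.
    """
    if len(layer_list) != num_pathways:
        raise ValueError(
            f"GRAD_CAM.LAYER_LIST has {len(layer_list)} entries "
            f"but the model has {num_pathways} pathway(s)"
        )
    return True


# SlowFast: two pathways, so two layers (values taken from the config above).
check_grad_cam_layers(["s5/pathway1_res2", "s5/pathway0_res2"], num_pathways=2)

# Assumed single-pathway model: exactly one layer name is expected.
check_grad_cam_layers(["patch_embed/proj"], num_pathways=1)
```

Under that assumption, reusing the two-entry SlowFast list for a single-pathway model would fail this check.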

RijulGupta-DM commented 11 months ago

Here's what's working in our DeepFake Detection codebase:

  MODEL_VIS:
    ENABLE: True
    MODEL_WEIGHTS: True # Set to True to visualize model weights.
    ACTIVATIONS: True # Set to True to visualize feature maps.
    INPUT_VIDEO: True # Set to True to visualize the input video(s) for the corresponding feature maps.
    LAYER_LIST: ['patch_embed/proj'] # List of layer names to visualize weights and activations for.
    GRAD_CAM:
      ENABLE: True
      USE_TRUE_LABEL: False
      LAYER_LIST: ['patch_embed/proj'] # List of CNN layers to use for Grad-CAM visualization method.
                  # The number of layer must be equal to the number of pathway(s).

You might need a couple of tweaks in the visualizer, but this LAYER_LIST produces decent Grad-CAM results.
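
To find other candidate names for LAYER_LIST, it can help to see how a slash-separated name such as `patch_embed/proj` maps onto nested modules. The sketch below assumes the visualizer resolves a name by splitting on `/` and descending through each module's children; that is an assumption about the codebase, so verify it in your checkout. `Node` is a stand-in for `torch.nn.Module` so the example runs without PyTorch, `resolve_layer` and `list_layer_names` are hypothetical helpers, and the `blocks` and `head` children of the toy model are made up for illustration.

```python
class Node:
    """Stand-in for torch.nn.Module: children live in a `_modules` dict."""

    def __init__(self, **children):
        self._modules = dict(children)


def resolve_layer(model, layer_name):
    """Walk a slash-separated layer name down through child modules."""
    module = model
    for part in layer_name.split("/"):
        if part not in module._modules:
            raise KeyError(f"no child {part!r} while resolving {layer_name!r}")
        module = module._modules[part]
    return module


def list_layer_names(model, prefix=""):
    """List every reachable layer name in slash-separated form, depth first."""
    names = []
    for name, child in model._modules.items():
        path = f"{prefix}/{name}" if prefix else name
        names.append(path)
        names.extend(list_layer_names(child, path))
    return names


# Toy single-pathway model; only patch_embed/proj comes from the config above.
proj = Node()
toy_mvit = Node(patch_embed=Node(proj=proj), blocks=Node(), head=Node())

print(list_layer_names(toy_mvit))
# ['patch_embed', 'patch_embed/proj', 'blocks', 'head']
print(resolve_layer(toy_mvit, "patch_embed/proj") is proj)  # True
```

With a real PyTorch model, `model.named_modules()` yields dot-separated names; replacing the dots with slashes should give the same kind of list, assuming the resolution rule above holds.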