wei-tim / YOWO

You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization

Can you share the Cam code #27

Closed forest-su closed 4 years ago

forest-su commented 4 years ago

Thank you for open-sourcing this! I'm interested in the activation maps for the 2D and 3D backbones of the trained model, but the relevant code is not public. Could you share the CAM code?

okankop commented 4 years ago

Hi @forest-su, we used the code from here: https://github.com/acheketa/pytorch-CAM

To create the "activation maps" for the 2D and 3D backbones, we modified that code heavily, which is why we have not shared it here. However, here's what we did: given a 16-frame input clip, the output features of the 2D backbone are weighted with 1 (since we want to see what activates the 2D backbone regardless of class), and the resulting activation map is overlaid on the key frame. For the 3D backbone, we used the output features of layer 3 of ResNeXt-101 (where the depth dimension is 4), again weighted with 1, and mapped them to the corresponding frames of the input clip.

It is not much effort: you just need to modify the forward pass of the YOWO model to return the desired output features, set the weights to 1, and feed them to the CAM module in the above repo.
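
For anyone trying to reproduce this, here is a rough NumPy sketch of the steps described above. It is not the code we actually used, and it does not call the CAM module from the linked repo. The function names (`unit_weight_activation_map`, `activation_maps_3d`, `overlay`) are made up for illustration, the nearest-neighbour upsampling and the red-channel overlay are stand-ins for whatever resizing and colormap you prefer, and the even split of the clip across depth slices (4 frames per slice for a 16-frame clip) is an assumption about how slices map to frames. It also assumes you have already pulled the feature maps out of the model as arrays of shape `(C, H, W)` for the 2D backbone and `(C, D, H, W)` for the 3D backbone.

```python
import numpy as np


def unit_weight_activation_map(features, out_hw):
    """Class-agnostic activation map from a (C, H, W) feature array.

    Every channel gets weight 1, so the map is just the channel sum,
    min-max normalised to [0, 1] and upsampled to out_hw = (height, width).
    """
    cam = features.sum(axis=0)          # weight 1 for every channel
    cam = cam - cam.min()
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    # Nearest-neighbour upsampling (illustrative; any resize would do).
    out_h, out_w = out_hw
    rows = np.arange(out_h) * cam.shape[0] // out_h
    cols = np.arange(out_w) * cam.shape[1] // out_w
    return cam[rows][:, cols]


def activation_maps_3d(features, out_hw, clip_len=16):
    """One map per depth slice of a (C, D, H, W) feature array.

    Returns a list of (frame_indices, cam) pairs. Assumption: depth slice d
    covers clip_len // D consecutive input frames.
    """
    depth = features.shape[1]
    frames_per_slice = clip_len // depth
    maps = []
    for d in range(depth):
        frames = list(range(d * frames_per_slice, (d + 1) * frames_per_slice))
        maps.append((frames, unit_weight_activation_map(features[:, d], out_hw)))
    return maps


def overlay(frame, cam, alpha=0.5):
    """Blend a [0, 1] map onto an (H, W, 3) uint8 frame via the first channel."""
    heat = np.zeros(frame.shape, dtype=np.float32)
    heat[..., 0] = cam * 255.0
    return ((1 - alpha) * frame + alpha * heat).astype(np.uint8)
```

For the 2D backbone you would call `unit_weight_activation_map` on its output and `overlay` the result on the key frame; for the 3D backbone, `activation_maps_3d` gives one map per depth slice together with the frame indices it is assumed to cover.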