swathikirans / GSM

Gate-Shift Networks for Video Action Recognition - CVPR 2020

t-sne #15

Open fuchao01 opened 4 years ago

fuchao01 commented 4 years ago

In the paper, which layer's features are used to produce the t-SNE plot in Figure 5?

swathikirans commented 4 years ago

The feature obtained after spatial average pooling in BNInception is used for the t-SNE.

fuchao01 commented 4 years ago

Is it the feature obtained from here (https://github.com/swathikirans/GSM/blob/master/models.py#L194)?

swathikirans commented 4 years ago

Yes, you are right.

fuchao01 commented 4 years ago

Thank you for your reply. However, the feature I extracted from base_out has shape (8, 2048). How should I apply the spatial average pooling?

swathikirans commented 4 years ago

The output from the backbone (8×2048) is already spatially pooled. It is then average-pooled over the temporal dimension (the 8 rows) to obtain a single 2048-dimensional vector, which is used for the t-SNE visualization.
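
For anyone landing here later, a minimal sketch of the temporal average pooling described above, written with NumPy so it runs without the model. The helper name `temporal_average_pool` and the random stand-in for `base_out` are illustrative and not part of the repository. It also assumes the rows belonging to one video are contiguous, and that a real `base_out` tensor has first been converted to an array (e.g. with `base_out.detach().cpu().numpy()`).

```python
import numpy as np


def temporal_average_pool(frame_features, num_segments):
    """Average per-segment features over time (hypothetical helper).

    frame_features: array of shape (num_videos * num_segments, feat_dim),
                    with the rows of each video stored contiguously (assumption).
    Returns an array of shape (num_videos, feat_dim).
    """
    frame_features = np.asarray(frame_features)
    total, feat_dim = frame_features.shape
    if total % num_segments != 0:
        raise ValueError("number of rows must be a multiple of num_segments")
    # Group rows by video, then average over the temporal axis.
    per_video = frame_features.reshape(-1, num_segments, feat_dim)
    return per_video.mean(axis=1)


# Random stand-in for base_out of one video: 8 segments, 2048-dim features.
rng = np.random.default_rng(0)
base_out = rng.standard_normal((8, 2048))

video_feature = temporal_average_pool(base_out, num_segments=8)
print(video_feature.shape)  # (1, 2048)
```

Collecting one such vector per video gives an array of shape (num_videos, 2048), which can then be passed to a t-SNE implementation such as scikit-learn's `sklearn.manifold.TSNE` to get 2-D points for plotting.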