facebookresearch / vissl

VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
https://vissl.ai
MIT License

Seer activation maps examples #523

Closed: seekingdeep closed this issue 2 years ago

seekingdeep commented 2 years ago

Hi there,

As an example, say the model was trained on pictures of the sea and also on pictures of ships. What if we show it an image that contains many ships densely packed together in the sea?

What will the activation map look like?

QuentinDuval commented 2 years ago

Hi @seekingdeep,

First of all, thank you for reaching out :)

Regarding attention maps: our SEER models are based on a convolutional network (a RegNet, to be specific), not a Vision Transformer (as in the video you linked above), so they do not contain any self-attention layers.

This unfortunately means that extracting an attention map the way your video does is not directly possible, precisely because there is no self-attention layer to read it from.

There might be other ways to achieve something similar, but we have not dug into that. Did you have something specific in mind?

Thank you, Quentin

seekingdeep commented 2 years ago

It seemed interesting to see how such a model sees things. Thanks, good luck.