Alpha-VL / ConvMAE

ConvMAE: Masked Convolution Meets Masked Autoencoders
MIT License
484 stars 41 forks

Visualization of ViT features #34

Open kimsekeun opened 10 months ago

kimsekeun commented 10 months ago

Hi, author.

How can I visualize the attention maps shown in your results?

1) Should I use the encoder (ViT)? 2) Or the decoder (ViT)?

That is, given an input x, compute y = encoder(x) and then decoder(y). Should the attention map be taken from the final ViT block of decoder(y)?
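
For anyone landing here with the same question: this is not an answer from the authors, only a minimal sketch of one common way to get an attention map out of a ViT block. In MAE-style pretraining the decoder is used only for reconstruction, so visualizations are usually taken from an encoder block, but this thread does not say how the figures in this repository were produced. The sketch assumes you have already captured the output of a block's fused qkv projection for one image (for example with a forward hook), that it follows the common (tokens, 3 * dim) layout, and that there is no class token. The function name `attention_map` and all shapes below are hypothetical and not taken from the ConvMAE code.

```python
import numpy as np

def attention_map(qkv, num_heads, grid_hw):
    """Recompute self-attention from a fused qkv output and fold it into a 2D map.

    qkv:      array of shape (N, 3 * dim) for one image (assumed layout)
    grid_hw:  (height, width) of the token grid, with height * width == N
    """
    n_tokens, three_dim = qkv.shape
    head_dim = three_dim // 3 // num_heads
    # (N, 3, heads, head_dim) -> (3, heads, N, head_dim)
    qkv = qkv.reshape(n_tokens, 3, num_heads, head_dim).transpose(1, 2, 0, 3)
    q, k = qkv[0], qkv[1]
    # Scaled dot-product scores per head: (heads, N, N)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    # Numerically stable softmax over the key axis
    scores = scores - scores.max(axis=-1, keepdims=True)
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)
    # Average over heads, then over query tokens:
    # how much attention each token receives on average
    received = attn.mean(axis=0).mean(axis=0)
    return received.reshape(grid_hw)

# Random stand-in for a captured qkv tensor (hypothetical 14x14 token grid, dim 64)
rng = np.random.default_rng(0)
qkv = rng.standard_normal((14 * 14, 3 * 64))
amap = attention_map(qkv, num_heads=4, grid_hw=(14, 14))
print(amap.shape)  # (14, 14)
```

The resulting map can be upsampled to the input resolution and overlaid on the image. Averaging over query tokens is one choice among several; picking a single query token's row gives a map of what that patch attends to instead.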