Visualizing ViT features

Alpha-VL / ConvMAE

ConvMAE: Masked Convolution Meets Masked Autoencoders

MIT License


Open kimsekeun opened 10 months ago

kimsekeun commented 10 months ago

Hi, author.

How did you produce the attention map visualizations shown in your results?

1) Did you use the encoder (ViT)? 2) Or the decoder (ViT)?

That is, given an input x, y = encoder(x) is passed to decoder(y). Are the attention maps then taken from the final ViT block of the decoder?
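To make the question concrete, here is a rough sketch of what I mean by an "attention map" from one ViT block. It is a toy NumPy stand-in, not ConvMAE's code: the function `attention_map`, the random `tokens`, the weights `w_q` / `w_k`, and the 14x14 grid are all assumptions for illustration. In practice the tokens and projection weights would come from whichever block (encoder or decoder) is the right one to visualize.

```python
import numpy as np

def attention_map(tokens, w_q, w_k, num_heads, grid_hw):
    """Head-averaged attention received by each patch token, as a 2-D grid.

    tokens:  (N, D) patch tokens entering the block (hypothetical, no class token).
    w_q/w_k: (D, D) query/key projection weights of that block (hypothetical).
    """
    n, d = tokens.shape
    head_dim = d // num_heads
    # Project and split into heads: (num_heads, N, head_dim).
    q = (tokens @ w_q).reshape(n, num_heads, head_dim).transpose(1, 0, 2)
    k = (tokens @ w_k).reshape(n, num_heads, head_dim).transpose(1, 0, 2)
    # Scaled dot-product scores: (num_heads, N, N).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)  # softmax over keys
    # Average over heads, then over queries: attention each token receives.
    received = attn.mean(axis=0).mean(axis=0)  # (N,)
    return received.reshape(grid_hw)

# Toy inputs standing in for real features from the chosen block.
rng = np.random.default_rng(0)
grid_hw = (14, 14)   # e.g. a 224x224 image with 16x16 patches (assumption)
dim, heads = 64, 4
tokens = rng.standard_normal((grid_hw[0] * grid_hw[1], dim))
w_q = rng.standard_normal((dim, dim)) / np.sqrt(dim)
w_k = rng.standard_normal((dim, dim)) / np.sqrt(dim)

amap = attention_map(tokens, w_q, w_k, heads, grid_hw)
print(amap.shape)
```

The resulting grid could then be upsampled to the image size and overlaid as a heatmap. My question is which block's tokens and weights should be fed into a computation like this.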