ViTAE-Transformer / ViTDet

Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object Detection"
Apache License 2.0
530 stars 46 forks

[Question] MAE vitdet interval #21

Open chagmgang opened 2 years ago

chagmgang commented 2 years ago

When using MAE pretrained weights, is interval still 3? If interval is set to 3, window attention is applied, so the Vision Transformer architecture changes from the one used in MAE to a ViT with window attention.
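To make the question concrete, here is a minimal sketch of how an interval setting could map to per-block attention types. It assumes the convention that every interval-th block uses global attention and all other blocks use window attention. The function name `attention_layout` is hypothetical and not taken from this repository, and the actual rule in the code may differ.

```python
def attention_layout(depth: int, interval: int) -> list[str]:
    """Return the assumed attention type for each transformer block.

    Assumption: block i (0-indexed) uses global attention when
    (i + 1) is a multiple of `interval`, and window attention otherwise.
    """
    return [
        "global" if (i + 1) % interval == 0 else "window"
        for i in range(depth)
    ]


# A ViT-Base-sized backbone has 12 blocks.
layout = attention_layout(depth=12, interval=3)
for index, kind in enumerate(layout):
    print(f"block {index:2d}: {kind}")
```

Under this assumption, 8 of the 12 blocks would use window attention with interval 3, whereas MAE pretraining uses global attention in every block. That mismatch is what the question is about.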