Open chagmgang opened 2 years ago
If use pretrained weight of MAE, interval is sill 3? If interval is set to be 3, window attention is applied. The architecture of vision transformer is changed from mae to vit with window attention.
If use pretrained weight of MAE, interval is sill 3? If interval is set to be 3, window attention is applied. The architecture of vision transformer is changed from mae to vit with window attention.