ViTAE-Transformer / ViTDet

Unofficial implementation of [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object Detection"

Question of PCM #5

Closed: HerrYu123 closed this issue 2 years ago

HerrYu123 commented 2 years ago

In the paper, the convolutional propagation is added after each subset of blocks, but the PCM is added to every block. Is that right?

Annbless commented 2 years ago

Hello.

That is correct: in ViTDet, the convolutional propagation is added after each subset of ViT blocks, while in ViTAE the PCM is added to every block. We do not add a PCM to the ViT model.
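
For concreteness, here is a minimal PyTorch sketch of the per-block PCM, assuming standard `torch.nn` modules. The class name `ViTAEBlock`, the conv-branch layout, and all layer sizes are illustrative assumptions, not the repo's actual code; only the pattern (a conv branch running in parallel with attention inside every block) reflects ViTAE.

```python
import torch
import torch.nn as nn

class ViTAEBlock(nn.Module):
    """Transformer block with a Parallel Convolution Module (PCM) branch.

    Illustrative only: in ViTAE a small conv stack runs in parallel with
    attention inside every block, and the two branch outputs are summed.
    """
    def __init__(self, dim, num_heads, img_size):
        super().__init__()
        self.img_size = img_size
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # PCM: lightweight conv branch parallel to attention (sizes assumed)
        self.pcm = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1),
            nn.BatchNorm2d(dim),
            nn.SiLU(),
            nn.Conv2d(dim, dim, 3, padding=1),
        )
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):                      # x: (B, N, C), N = H * W
        B, N, C = x.shape
        H = W = self.img_size
        y = self.norm1(x)
        attn_out, _ = self.attn(y, y, y)
        # PCM sees the tokens as a 2D map and runs alongside attention
        conv_out = self.pcm(x.transpose(1, 2).reshape(B, C, H, W))
        x = x + attn_out + conv_out.flatten(2).transpose(1, 2)
        x = x + self.mlp(self.norm2(x))        # standard MLP sub-block
        return x

x = torch.randn(2, 14 * 14, 192)               # batch of 14x14 token maps
print(ViTAEBlock(dim=192, num_heads=3, img_size=14)(x).shape)  # (2, 196, 192)
```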

In the current implementation, both the ViT and ViTAE models use a global attention layer at the end of each subset to propagate information across windows. We will add the convolutional propagation option soon.
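
And a sketch of that subset-level alternative: a residual conv layer inserted after every subset of windowed-attention blocks, where the global attention block currently sits. `ConvPropagation`, `build_stage`, the `make_block` factory, and the layer sizes are hypothetical; only the interleaving pattern follows the paper.

```python
import torch.nn as nn

class ConvPropagation(nn.Module):
    """Residual 3x3 convs that mix information across windows.

    Stand-in for the paper's convolutional propagation; the residual form
    lets it be initialized close to identity.
    """
    def __init__(self, dim):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1),
            nn.GELU(),
            nn.Conv2d(dim, dim, 3, padding=1),
        )

    def forward(self, x):            # x: (B, C, H, W) token map
        return x + self.conv(x)

def build_stage(make_block, depth, subset_size, dim):
    """Insert one propagation layer after every `subset_size` blocks.

    `make_block` is a hypothetical factory for windowed-attention blocks;
    e.g. depth=12, subset_size=3 yields four subsets of three blocks,
    each followed by a ConvPropagation layer.
    """
    layers = []
    for i in range(depth):
        layers.append(make_block())
        if (i + 1) % subset_size == 0:
            layers.append(ConvPropagation(dim))
    return nn.Sequential(*layers)
```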

HerrYu123 commented 2 years ago

I got it! Thanks for your quick reply.