Only use the first patch after T2T-ViT Backbone

yitu-opensource / T2T-ViT

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet


Closed: JingXfei closed this issue 3 years ago

JingXfei commented 3 years ago

Hello, and thanks for your excellent work! However, I don't understand why only the first patch is used after the T2T-ViT backbone.

yuanli2333 commented 3 years ago

The first token is not the first patch; it is the class token used for classification, as in this line.
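As a rough illustration of the indexing (not code from the repository), the NumPy sketch below prepends a class token to the patch tokens and then selects index 0 of the sequence. The function names `prepend_cls_token` and `select_cls_output`, the shapes, and the zero-valued class token are assumptions chosen for the example. The transformer blocks are omitted; in a real ViT-style model the class token is a learned parameter and attention mixes patch information into it before it is read out.

```python
import numpy as np


def prepend_cls_token(patch_tokens, cls_token):
    """Prepend one class token to every sequence in the batch.

    patch_tokens: array of shape (B, N, D), one row per image patch.
    cls_token:    array of shape (1, 1, D), shared across the batch.
    Returns an array of shape (B, N + 1, D) with the class token at index 0.
    """
    batch = patch_tokens.shape[0]
    cls = np.broadcast_to(cls_token, (batch, 1, cls_token.shape[-1]))
    return np.concatenate([cls, patch_tokens], axis=1)


def select_cls_output(tokens):
    """Return the token at sequence index 0, i.e. the class token, shape (B, D)."""
    return tokens[:, 0]


# Hypothetical sizes: 2 images, 196 patch tokens each, 8 channels.
rng = np.random.default_rng(0)
B, N, D = 2, 196, 8
patch_tokens = rng.normal(size=(B, N, D))
cls_token = np.zeros((1, 1, D))  # zeros only to make the token easy to spot

tokens = prepend_cls_token(patch_tokens, cls_token)
# Transformer blocks would run here; they keep the shape (B, N + 1, D).
features = select_cls_output(tokens)

print(tokens.shape)    # sequence length is N + 1 because of the class token
print(features.shape)  # one D-dimensional vector per image
```

Because the class token sits at index 0, the first patch is at index 1, so `tokens[:, 0]` never reads a patch directly.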