yitu-opensource / T2T-ViT

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Only use the first patch after T2T-ViT Backbone #8

Closed JingXfei closed 3 years ago

JingXfei commented 3 years ago

Hello, and thanks for your excellent work! However, I don't understand why only the first patch is used after the T2T-ViT backbone.

https://github.com/yitu-opensource/T2T-ViT/blob/b873c90f890522822b2c67b519fa13283e2287e8/models/t2t_vit.py#L166

yuanli2333 commented 3 years ago

The first one is not the first patch but the class token used for classification, as in this line.
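
To make the distinction concrete, here is a minimal, dependency-free sketch of the general ViT-style pattern. It is not the repository's code. The helper names `prepend_cls_token` and `mix_tokens` are hypothetical, and the residual uniform averaging is only a stand-in for the real transformer blocks. It shows that index 0 of the sequence holds the class token, which gathers information from the patch tokens, so selecting index 0 (roughly what `x[:, 0]` would do on a batched tensor) does not select an image patch.

```python
# Toy illustration only: tokens are plain lists of floats, not tensors.

def prepend_cls_token(patch_tokens, cls_token):
    """Return a new sequence with the class token placed at index 0."""
    return [cls_token] + patch_tokens


def mix_tokens(tokens):
    """Stand-in for the transformer blocks (an assumption, not real attention).

    Each token is updated as token + mean(all tokens). Real self-attention uses
    learned, input-dependent weights instead of a uniform mean.
    """
    dim = len(tokens[0])
    mean = [sum(tok[d] for tok in tokens) / len(tokens) for d in range(dim)]
    return [[tok[d] + mean[d] for d in range(dim)] for tok in tokens]


patch_tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 patches, embedding dim 2
cls_token = [0.0, 0.0]                                # learnable in a real model

tokens = prepend_cls_token(patch_tokens, cls_token)   # [cls, patch0, patch1, patch2]
encoded = mix_tokens(tokens)

cls_output = encoded[0]     # the class token's output, fed to the classifier head
first_patch = encoded[1]    # the first actual patch sits at index 1, not 0
```

With this layout the sequence is one element longer than the number of patches, and the classifier only reads the slot at index 0, which has already aggregated the patch information in the preceding blocks.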