aanna0701 / SPT_LSA_ViT

Implementation of Visual Transformer for Small-size Datasets

ImageNet issue #4

Closed SY-Xuan closed 2 years ago

SY-Xuan commented 2 years ago

Nice work! You use a small but high-resolution network for TinyImageNet: for example, the ViT patch size is 8 and the Swin window size is 4. On ImageNet, however, a Swin window size of 4 does not work. What parameters did you use for ImageNet? Could you please give me some details about the ViT and Swin configurations?

aanna0701 commented 2 years ago

I used the ViT-Tiny model suggested by the DeiT paper and the Swin-T model suggested by the Swin paper.
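
For readers following along, here is a minimal sketch of what those two ImageNet configurations look like and why a window size of 4 fails there. The thread itself does not list the hyperparameters; the values below are the defaults as commonly reported for DeiT-Tiny and Swin-T and should be checked against the papers. The class and function names (`ViTConfig`, `SwinConfig`, `swin_window_fits`, and so on) are hypothetical and not taken from this repository, and the divisibility check is a simplification of what a real Swin implementation does.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class ViTConfig:
    img_size: int
    patch_size: int
    embed_dim: int
    depth: int
    num_heads: int


@dataclass(frozen=True)
class SwinConfig:
    img_size: int
    patch_size: int
    window_size: int
    embed_dim: int
    depths: Tuple[int, ...]
    num_heads: Tuple[int, ...]


# Assumed defaults (not stated in this thread): DeiT-Tiny and Swin-T on 224x224 ImageNet.
DEIT_TINY = ViTConfig(img_size=224, patch_size=16, embed_dim=192, depth=12, num_heads=3)
SWIN_T = SwinConfig(img_size=224, patch_size=4, window_size=7, embed_dim=96,
                    depths=(2, 2, 6, 2), num_heads=(3, 6, 12, 24))

# Same Swin-T layout, but with the window size of 4 mentioned in the question.
SWIN_T_WINDOW_4 = replace(SWIN_T, window_size=4)


def vit_num_patches(cfg: ViTConfig) -> int:
    """Number of patch tokens for a square image split into square patches."""
    return (cfg.img_size // cfg.patch_size) ** 2


def swin_stage_resolutions(cfg: SwinConfig) -> List[int]:
    """Feature-map side length per stage; each patch-merging step halves it."""
    side = cfg.img_size // cfg.patch_size
    return [side // (2 ** i) for i in range(len(cfg.depths))]


def swin_window_fits(cfg: SwinConfig) -> bool:
    """Simplified check: every stage's side length must be a multiple of the window size."""
    return all(side % cfg.window_size == 0 for side in swin_stage_resolutions(cfg))


if __name__ == "__main__":
    print("DeiT-Tiny patch tokens:", vit_num_patches(DEIT_TINY))
    print("Swin-T stage resolutions:", swin_stage_resolutions(SWIN_T))
    print("window 7 fits:", swin_window_fits(SWIN_T))
    print("window 4 fits:", swin_window_fits(SWIN_T_WINDOW_4))
```

Under these assumptions, the Swin-T feature maps have side lengths 56, 28, 14 and 7. A window of 7 divides all of them, while a window of 4 no longer divides the 14x14 stage, which is consistent with the observation in the question.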

SY-Xuan commented 2 years ago

OK. Thank you!