PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
Hybrid ViT fails in the constructor for image size = 200 #15
Open
pgagarinov opened 3 years ago
vit = VisionTransformer(CONFIGS['R50-ViT-B_16'], zero_head=False, img_size=200)
This call leads to a "float division by zero" exception:
ZeroDivisionError Traceback (most recent call last)