emla2805 / vision-transformer

TensorFlow implementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
https://openreview.net/pdf?id=YicbFdNTTy
206 stars, 63 forks