implement supervised vision transformer

alexatartaglini / developmental-shape-bias


implement supervised vision transformer #21

Closed alexatartaglini closed 2 years ago

alexatartaglini commented 3 years ago

Need to implement the supervised ViTs from this paper: https://arxiv.org/abs/2010.11929

Pre-trained weights can be downloaded from one of two sources:

https://github.com/rwightman/pytorch-image-models
https://huggingface.co/transformers/model_doc/vit.html

wkvong commented 3 years ago

One detail we should double-check when implementing this: a base ViT makes its classification decision from the embedding at the very first position of the transformer's output sequence (the extra learnable [class] embedding; see Figure 1 of the paper). For our simulations, we'd probably want to replicate this by extracting only that embedding, rather than all of the embeddings across the image patches plus this one.
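A minimal sketch of that extraction step, assuming the encoder output is already available as an array of shape `(batch, 1 + num_patches, dim)` with the [class] token at index 0. The helper name `extract_class_embedding` and the random stand-in tensor are hypothetical, used only for illustration; with a real model, the array would come from whichever library we load the weights from, and its token layout should be checked before slicing.

```python
import numpy as np


def extract_class_embedding(tokens: np.ndarray) -> np.ndarray:
    """Return only the [class] token embedding from a ViT encoder output.

    Assumes `tokens` has shape (batch, 1 + num_patches, dim) and that the
    [class] token sits at sequence index 0 (hypothetical helper, not part
    of any library).
    """
    if tokens.ndim != 3:
        raise ValueError(f"expected a 3-D array, got shape {tokens.shape}")
    # Keep position 0 of the sequence axis; drop all patch embeddings.
    return tokens[:, 0, :]


# Random stand-in for a ViT-Base/16 output on 224x224 images:
# 14 * 14 = 196 patch tokens plus 1 [class] token, each 768-dimensional.
rng = np.random.default_rng(0)
dummy_tokens = rng.standard_normal((2, 197, 768))

cls_embedding = extract_class_embedding(dummy_tokens)
print(cls_embedding.shape)  # (2, 768)
```

The result has one vector per image, so it can be compared across stimuli directly without deciding how to pool the patch embeddings.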