Closed by muaz1994 3 years ago
Hello, and many thanks for your code. Which model performs best? According to page 12 of the paper, ViT-B/16 seems to perform best. Does that mean fewer layers work better?
In general, bigger models and smaller patches are better. As you said, among the ImageNet-pretrained models, B16 seems to be the best.
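To make "smaller patches" concrete, here is a minimal sketch of how the patch size changes the number of patches the model sees per image. It assumes a square 224x224 input split into non-overlapping square patches; the input size and the helper name `num_patches` are my own assumptions for illustration, not something taken from this repo.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Count non-overlapping square patches in a square image.

    Hypothetical helper for illustration; not part of the repo's API.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


# Assumed input resolution of 224x224 (an illustrative choice).
for name, patch in [("B/16", 16), ("B/32", 32)]:
    print(name, num_patches(224, patch))
# prints:
# B/16 196
# B/32 49
```

So under these assumptions a /16 model works on four times as many patches per image as a /32 model, which also makes it more expensive to run.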