Closed saruarlive closed 3 years ago
Hi @ahatamiz ,
Could you please share your thoughts on this question, since you are the expert on it?
Thanks in advance.
Hi @saruarlive, thank you for your comment. This version of the ViT model was designed to serve as a backbone for segmentation models, in particular the UNETR model, so it does not use class tokens.
I will submit a new PR to support classification applications for the ViT model, which will also add the class token.
Tracked by https://github.com/Project-MONAI/MONAI/issues/2682, I'm closing this.
Hi, I have found that the class embedding (class token, self.cls_token) is not applied in the patch embedding block used in the ViT class, although the Vision Transformer paper describes it. Is there any reason not to add/concatenate the class token? source: here
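For readers unfamiliar with the mechanism being asked about, here is a minimal sketch of how a learnable class token is usually prepended to the patch sequence. It is not MONAI's implementation: the class name `PatchEmbeddingWithClsToken`, the `use_cls_token` flag, and all default sizes are assumptions made for illustration, and the 3D convolutional projection is just one common way to embed patches.

```python
import torch
import torch.nn as nn


class PatchEmbeddingWithClsToken(nn.Module):
    """Hypothetical 3D patch embedding with an optional learnable class token."""

    def __init__(self, in_channels=1, img_size=32, patch_size=16,
                 hidden_size=64, use_cls_token=True):
        super().__init__()
        # Non-overlapping patches: a strided convolution projects each patch
        # to a hidden_size-dimensional vector.
        self.proj = nn.Conv3d(in_channels, hidden_size,
                              kernel_size=patch_size, stride=patch_size)
        n_patches = (img_size // patch_size) ** 3
        self.use_cls_token = use_cls_token
        # The class token adds one extra position to the sequence.
        n_tokens = n_patches + (1 if use_cls_token else 0)
        self.position_embeddings = nn.Parameter(
            torch.zeros(1, n_tokens, hidden_size))
        if use_cls_token:
            self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden_size))

    def forward(self, x):
        x = self.proj(x)                  # (B, hidden, D', H', W')
        x = x.flatten(2).transpose(1, 2)  # (B, n_patches, hidden)
        if self.use_cls_token:
            # Share the single learned token across the batch and prepend it.
            cls = self.cls_token.expand(x.shape[0], -1, -1)
            x = torch.cat((cls, x), dim=1)  # (B, n_patches + 1, hidden)
        return x + self.position_embeddings
```

With the flag disabled, the sequence contains only patch tokens, which is what a segmentation backbone needs because every output token maps back to a spatial location. With the flag enabled, a classification head would typically read the first token of the transformer output.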