NVlabs / A-ViT

Official PyTorch implementation of A-ViT: Adaptive Tokens for Efficient Vision Transformer (CVPR 2022)
Apache License 2.0

token nums #7

sutiankang opened 1 year ago

sutiankang commented 1 year ago

Hi, thanks for your excellent work. The paper says the number of tokens changes across Transformer layers, but when I debugged the code I found that the number of tokens is the same in every layer. What is the reason for this?

cvding commented 1 year ago

In `act_vision_transformer.py`, pay attention to the `self.counter_token` state.
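
For anyone else tracing this: a minimal sketch of the idea, assuming a hypothetical per-token `cumulative_halting` score and a halting threshold of 1.0 (the repo tracks this kind of state via `self.counter_token` in `act_vision_transformer.py`; names and shapes here are illustrative, not the actual implementation). Halted tokens are masked, not dropped, so the token tensor keeps its full `[B, N, D]` shape in every layer, which is exactly what you see in the debugger:

```python
import torch

# Toy shapes: B batches, N tokens, D channels.
B, N, D = 2, 197, 384
tokens = torch.randn(B, N, D)

# Hypothetical cumulative halting score per token, accumulated across layers
# (ACT-style halting; the repo keeps similar state in self.counter_token).
cumulative_halting = torch.rand(B, N)

# A token is "halted" once its cumulative score crosses the threshold.
active_mask = (cumulative_halting < 1.0).float()   # [B, N], 1 = still active

# (i) Zero out the values of halted tokens. The tensor stays [B, N, D],
# so the token count you print in a debugger never changes.
tokens = tokens * active_mask.unsqueeze(-1)

print(tokens.shape)  # torch.Size([2, 197, 384]) in every layer
```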

followtheart commented 1 year ago

So the halted token is only (i) zeroed out in value and (ii) blocked from attending to other tokens, shielding its impact on t_{l+1}^k in Eqn. 2, but NOT actually removed?
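
Right, the shapes stay fixed, so "removal" is implemented as masking. A sketch of the attention side, step (ii), assuming standard scaled-dot-product attention (toy sizes and a random mask, purely illustrative; the first token is kept active here only so no row in the toy example is fully masked). Halted tokens get `-inf` on their key columns, so they receive zero attention weight and stop influencing t_{l+1}^k:

```python
import torch
import torch.nn.functional as F

B, H, N, Dh = 2, 6, 197, 64                  # batch, heads, tokens, head dim (toy sizes)
q, k, v = (torch.randn(B, H, N, Dh) for _ in range(3))

active = torch.randint(0, 2, (B, N)).bool()  # True = active, False = halted (toy values)
active[:, 0] = True                          # keep one token active so no row is all -inf

logits = (q @ k.transpose(-2, -1)) / Dh ** 0.5           # [B, H, N, N]

# (ii) Block attention *to* halted tokens: -inf on their key columns means
# softmax assigns them zero weight, shielding their impact on later layers.
logits = logits.masked_fill(~active[:, None, None, :], float('-inf'))
attn = F.softmax(logits, dim=-1)

out = attn @ v   # still [B, H, N, Dh]: halted tokens are shielded, not removed
```

Combined with (i) zeroing the token values, this has the same effect as removing the token from the computation, without ever changing the tensor shapes.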