NVlabs / A-ViT

Official PyTorch implementation of A-ViT: Adaptive Tokens for Efficient Vision Transformer (CVPR 2022)
Apache License 2.0

The inference time of A-ViT is the same as DeiT's #2

Closed · dk-liang closed this issue 1 year ago

dk-liang commented 1 year ago

Thanks for this interesting work; I believe it will be valuable for people in this area. I have some questions. Could the authors provide some explanation?

(1) Why is the inference time of A-ViT the same as DeiT's? According to the paper, A-ViT is faster than DeiT, but I find that the inference time is identical regardless of whether the pre-trained model is loaded or not.

(2) According to the paper, different samples should retain different numbers of tokens, i.e., the tensor shapes should differ in the testing phase. How, then, can the validation set be evaluated with a batch size of 64? To the best of my knowledge, it is almost impossible to combine samples with different token counts into one batch.
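For reference, here is a minimal sketch of what I assume the released evaluation does, which would explain both points: halted tokens are masked out rather than physically removed, so every sample keeps the full (N, D) token tensor. Fixed shapes make batching at 64 trivial, and because the dense attention matmul is still computed in full, wall-clock time matches DeiT. The variable names below are illustrative, not taken from this repo.

```python
import torch
import torch.nn.functional as F

B, N, D = 64, 197, 384                   # batch, tokens (CLS + 14x14 patches), embed dim
x = torch.randn(B, N, D)                 # token embeddings entering a block
halting_mask = torch.rand(B, N) > 0.3    # True = token still active (illustrative)
halting_mask[:, 0] = True                # always keep the CLS token active

# Single-head attention for brevity; halted tokens are excluded as keys by
# pushing their attention logits to -inf. Tensor shapes never vary per sample,
# so samples with different active-token counts batch together trivially.
q = k = v = x
attn = (q @ k.transpose(-2, -1)) / D ** 0.5               # (B, N, N)
attn = attn.masked_fill(~halting_mask[:, None, :], float('-inf'))
out = F.softmax(attn, dim=-1) @ v                         # (B, N, D)

# The dense (B, N, N) product above is computed regardless of how many
# tokens have halted, so measured latency equals the DeiT baseline.
```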

hongxuyin commented 1 year ago

Hi, yes, that snippet is not yet released. We will release dynamic zipping, distillation, the base model, etc. in coming versions. Thanks for the feedback; we will update the README to reflect this.
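For context, here is a hypothetical sketch of what physically dropping halted tokens ("zipping") could look like; the helper name `zip_tokens` is invented for illustration, not taken from the repo. Because each sample then keeps a different number of tokens, the batch must be split into per-sample tensors (or re-padded), which is also why the masked evaluation above shows no speedup.

```python
import torch
from typing import List

def zip_tokens(x: torch.Tensor, active: torch.Tensor) -> List[torch.Tensor]:
    """x: (B, N, D) token embeddings; active: (B, N) bool halting mask.
    Returns one (n_i, D) tensor per sample with halted tokens removed."""
    return [x[i][active[i]] for i in range(x.shape[0])]

B, N, D = 4, 197, 384
x = torch.randn(B, N, D)
active = torch.rand(B, N) > 0.5
active[:, 0] = True                        # always keep the CLS token
kept = zip_tokens(x, active)
print([t.shape[0] for t in kept])          # token counts now differ per sample
```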

YuanYeshang commented 7 months ago

Hi, may I ask whether the dynamic zipping code can be released now?