youweiliang / evit

Python code for ICLR 2022 spotlight paper EViT: Expediting Vision Transformers via Token Reorganizations
Apache License 2.0

Warmup strategy #7

Open 1171000410 opened 2 years ago

1171000410 commented 2 years ago

Hello, I would like to ask: if the warmup strategy is not used and the keep rate is instead set directly to the target value, will the experimental results differ greatly?

youweiliang commented 2 years ago

No, the resulting difference would be negligible.

1171000410 commented 2 years ago

Hi, I would like to add a question. After reading your code, I wonder whether token reorganization is applied in only three layers at test time for DeiT-S, but in all layers during training.

Lines 398-399 in evit.py:

```python
if not isinstance(keep_rate, (tuple, list)):
    keep_rate = (keep_rate, ) * self.depth
```
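For illustration, the broadcasting on those two lines can be sketched in isolation (a minimal standalone sketch, not the repository's code; the function name and the depth value are assumptions for the example):

```python
# Minimal sketch of the evit.py L398-399 broadcast: a scalar keep rate
# is expanded into one value per transformer block.
def expand_keep_rate(keep_rate, depth):
    # A single float becomes a per-layer tuple of identical values;
    # a tuple/list is assumed to already be per-layer.
    if not isinstance(keep_rate, (tuple, list)):
        keep_rate = (keep_rate,) * depth
    assert len(keep_rate) == depth
    return keep_rate

depth = 12  # e.g., DeiT-S has 12 blocks
print(expand_keep_rate(0.7, depth))  # a 12-tuple of 0.7
```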

youweiliang commented 2 years ago

Thanks for your question.

No, it is applied in only three layers. The keep_rate you mentioned controls the keep rate during training and is different from the keep rate used in inference: the self.keep_rate in the Attention module is the effective keep rate, while the keep_rate you mentioned will be None in inference. Another way to control the number of kept tokens is to change tokens. See also line 208: https://github.com/youweiliang/evit/blob/29c7f2a67192eda0d2957402228065581a071bd5/evit.py#L208

In other words, the code provides several ways to control the keep rate in training and inference, and it is up to users how to control it. The default, however, applies token reorganization in three layers.
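The default behavior described above can be sketched roughly as follows (a hypothetical NumPy illustration, not the repository's PyTorch code; the pruned layer indices, keep rate, and token shapes are assumptions for the example — EViT ranks patch tokens by their attention from the [CLS] token and keeps the top fraction):

```python
# Hypothetical sketch: token reorganization at three selected layers,
# keeping all tokens in every other layer.
import numpy as np

PRUNE_LAYERS = {3, 6, 9}  # assumed layer indices for the three-layer default

def keep_top_tokens(tokens, cls_attn, keep_rate):
    """Keep the [CLS] token plus the top-k patch tokens by [CLS] attention."""
    num_patches = tokens.shape[0] - 1
    k = int(num_patches * keep_rate)
    order = np.argsort(cls_attn)[::-1][:k]  # indices of most-attended patches
    # Row 0 is [CLS]; patch rows start at index 1
    return np.concatenate([tokens[:1], tokens[1 + order]])

rng = np.random.default_rng(0)
tokens = rng.standard_normal((197, 64))  # [CLS] + 196 patch tokens
for layer in range(12):
    if layer in PRUNE_LAYERS:
        # Stand-in for the real attention scores from [CLS] to each patch
        cls_attn = rng.random(tokens.shape[0] - 1)
        tokens = keep_top_tokens(tokens, cls_attn, keep_rate=0.7)
print(tokens.shape)  # → (67, 64): 196 → 137 → 95 → 66 patches kept
```

The token count shrinks only at the pruned layers, which is where the inference speedup comes from.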