Amshaker / SwiftFormer

[ICCV'23] Official repository of the paper "SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications"

Question about distillation #5

Open yaozengwei opened 1 year ago

yaozengwei commented 1 year ago

Thanks for the nice code. I notice the code uses hard distillation by default. Do the results in Table 2 of the paper come from distillation?
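For context, "hard distillation" here refers to the DeiT-style scheme in which the student is supervised by the teacher's argmax predictions as hard pseudo-labels, averaged with the standard cross-entropy on the ground-truth labels. A minimal PyTorch sketch of that loss (illustrative only, not the repository's exact implementation; the function name and the equal 0.5/0.5 weighting follow the DeiT recipe):

```python
import torch.nn.functional as F

def hard_distillation_loss(student_logits, teacher_logits, labels):
    # DeiT-style hard distillation: the teacher's argmax predictions
    # serve as hard pseudo-labels, averaged with the ground-truth
    # cross-entropy loss. (Sketch only; not this repo's exact code.)
    teacher_labels = teacher_logits.argmax(dim=1)
    loss_gt = F.cross_entropy(student_logits, labels)
    loss_kd = F.cross_entropy(student_logits, teacher_labels)
    return 0.5 * (loss_gt + loss_kd)
```

In the full DeiT-style setup the student usually has a separate distillation token/head for the second term; the sketch above collapses both terms onto a single head for brevity.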

Amshaker commented 1 year ago

Hi @yaozengwei, thank you for your interest in our work.

Yes, the Table 2 results are with distillation for both the baseline EfficientFormer and our SwiftFormer.

We will share the results without distillation soon.

Best regards, Abdelrahman.

yaozengwei commented 1 year ago

Thanks a lot!