raoyongming / DynamicViT

[NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
https://dynamicvit.ivg-research.xyz/
MIT License

About some innovative ideas #37

Open vegetableclean opened 11 months ago

vegetableclean commented 11 months ago

Hello, your paper is excellent, and I have a few questions I'd like to ask you.

I've read your paper, and I found the idea of using a lightweight predictor for dynamic token pruning to reduce FLOPs very innovative. I'm wondering whether this idea can also be applied to super-resolution models. I tried incorporating your `PredictorLG(nn.Module)` into my model (SwinIR), following your hyperparameters (e.g., base: 0.7), but I didn't see a significant reduction in FLOPs. However, your paper reports that the Swin Transformer variant achieved a substantial reduction, on the order of several tens of percent. Could you clarify whether I'm misunderstanding something, or whether there might be an implementation issue on my end?
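For context, here is a minimal, framework-agnostic sketch of the token-sparsification mechanism I understood from the paper: a predictor (standing in for `PredictorLG`) scores each token, the top fraction given by the keep ratio (`base: 0.7` here) is retained, and attention FLOPs in subsequent layers shrink roughly with the square of that ratio. This is my own simplified illustration in NumPy, not the repository's actual implementation (which uses differentiable sampling with Gumbel-Softmax during training); the random scores are placeholders for the predictor's logits.

```python
import numpy as np

def prune_tokens(tokens, scores, keep_ratio=0.7):
    """Keep the top `keep_ratio` fraction of tokens ranked by predicted score.

    tokens: (N, C) array of token features
    scores: (N,) array of keep-scores (stand-in for PredictorLG logits)
    """
    n = tokens.shape[0]
    k = int(np.ceil(n * keep_ratio))
    keep_idx = np.argsort(scores)[::-1][:k]   # indices of the k highest scores
    return tokens[np.sort(keep_idx)]          # preserve original token order

rng = np.random.default_rng(0)
tokens = rng.standard_normal((196, 64))   # e.g. 14x14 patch tokens, dim 64
scores = rng.standard_normal(196)         # placeholder predictor scores
kept = prune_tokens(tokens, scores, keep_ratio=0.7)
print(tokens.shape[0], "->", kept.shape[0])   # 196 -> 138

# Self-attention cost scales with N^2, so a single pruning stage at ratio r
# cuts the attention FLOPs of later layers by roughly a factor of r^2:
print(round((kept.shape[0] / tokens.shape[0]) ** 2, 2))   # ~0.5
```

Note that the FLOPs saving only materializes if the pruned tokens are actually removed from the sequence (as in inference-time pruning) rather than merely masked out, which may be relevant to why I see no reduction in my SwinIR setup.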