[ICCV 23] An approach to improve the efficiency of Vision Transformers (ViT) by jointly applying token pruning and token merging, with a differentiable compression rate.
87 stars, 8 forks
Inquiry on Fine-tuning Details for Table 2 in Your Repository #5
I would like to express my admiration for your work; it is truly straightforward and effective. However, I have run into an issue while trying to reproduce the "fine-tuning the model with the searched compression rate for 30 epochs" result reported in Table 2.
Specifically, after fine-tuning with the EViT framework, the accuracy unexpectedly decreased compared to the model before fine-tuning. Could you clarify the training details you used in your experiments? Any insights or guidance would be greatly appreciated.
Set the DropPath rate to 0.1 for DeiT-S and DeiT-B. Larger models should use a larger DropPath rate, for example 0.2 for ViT-L (MAE). The DropPath rate always follows the setting of the pre-trained model, so you can find the exact numbers in the official MAE repo.
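For reference, a minimal sketch of where this rate would be set, assuming a timm-based DeiT setup (the use of timm.create_model and the model name here are assumptions, not necessarily the repo's exact fine-tuning script):

```python
# Minimal sketch, assuming a timm-based DeiT/ViT setup; not the repo's exact script.
import timm

# drop_path_rate is the stochastic-depth (DropPath) rate discussed above:
# 0.1 for DeiT-S / DeiT-B, 0.2 for larger backbones such as ViT-L (MAE),
# following the pre-trained model's recipe.
model = timm.create_model(
    "deit_small_patch16_224",  # assumed model name for DeiT-S
    pretrained=True,
    drop_path_rate=0.1,
)
```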