OpenGVLab / DiffRate

[ICCV 23] An approach to enhance the efficiency of Vision Transformers (ViT) by concurrently employing token pruning and token merging, while incorporating a differentiable compression rate.

Inquiry on Fine-tuning Details for Table 2 in Your Repository #5

Closed SunkiLin closed 4 days ago

SunkiLin commented 5 months ago

Hello,

I hope this message finds you well.

I would like to express my admiration for your work. It is truly straightforward and effective. However, I have encountered an issue while attempting to reproduce the "fine-tuning the model with searched compression rate for 30 epochs" as described in Table 2 of your documentation.

Specifically, after employing the EViT framework for fine-tuning, I noticed that the accuracy unexpectedly decreased compared to the model before fine-tuning. I am reaching out to ask for the training details you used in your experiments. Any insights or guidance you could provide would be greatly appreciated.
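(For context: one frequent cause of accuracy dropping below the un-finetuned baseline in short fine-tuning runs is an overly large learning rate. Below is a minimal sketch of a cosine learning-rate schedule often used for such 30-epoch fine-tuning; the specific values `base_lr`, `min_lr`, and the schedule itself are illustrative assumptions, not the hyperparameters from the DiffRate paper.)

```python
import math

def cosine_lr(epoch, total_epochs=30, base_lr=1e-5, min_lr=1e-6):
    """Cosine-decayed learning rate for a short fine-tuning schedule.

    All values are illustrative assumptions, not the paper's actual
    hyperparameters (see Table 6 of the DiffRate paper for those).
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Fine-tuning a compressed model typically uses a much smaller base LR
# than pretraining; a too-large LR can push accuracy below the baseline.
for e in (0, 15, 29):
    print(f"epoch {e:2d}: lr = {cosine_lr(e):.2e}")
```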

Thank you very much for your time and assistance.

ChenMnZ commented 5 months ago

Table 6 in the paper gives the training details; please follow it.

Three parameters are crucial: