dongzelian / SSF

[NeurIPS'22] This is an official implementation for "Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning".
https://arxiv.org/pdf/2210.08823.pdf
MIT License

Head layer of the inference model #4

Closed zdgithub closed 1 year ago

zdgithub commented 1 year ago

Suppose the pre-trained model was trained on 1000 classes, and we fine-tune it on a downstream task with 100 classes. The paper says that during inference we should use the frozen pre-trained model without modifying the network architecture. Does that mean inference uses the head layer with 1000 classes?

dongzelian commented 1 year ago

@zdgithub The overall pipeline is: 1) get a pre-trained model; 2) fine-tune it by inserting scale and shift factors and modifying the head layer to adapt to the current task; 3) re-parameterize the model for inference. During re-parameterization, the scale and shift factors are absorbed into the backbone. In your case, the head layer is for 100 classes. Apart from the head layer, which is adapted to the current task, no other parts need to be modified. Thanks for pointing out this description; I will revise it to remove the ambiguity.
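
For illustration, here is a minimal sketch of how the re-parameterization step can fold the scale and shift factors into a preceding linear layer. Since the fine-tuned forward pass is `gamma * (W x + b) + beta`, it equals `(gamma * W) x + (gamma * b + beta)`, so the factors can be merged into new weights and a new bias for inference. The helper name `merge_ssf_into_linear` and the test values below are hypothetical and not taken from the repository:

```python
import torch
import torch.nn as nn


def merge_ssf_into_linear(linear: nn.Linear, gamma: torch.Tensor, beta: torch.Tensor) -> nn.Linear:
    """Absorb an SSF scale (gamma) and shift (beta) into a preceding linear layer.

    gamma * (W x + b) + beta == (gamma * W) x + (gamma * b + beta),
    so the scale and shift can be folded into the layer's weight and bias.
    """
    merged = nn.Linear(linear.in_features, linear.out_features, bias=True)
    with torch.no_grad():
        # Scale each output row of the weight matrix by its gamma entry.
        merged.weight.copy_(gamma.unsqueeze(1) * linear.weight)
        base_bias = linear.bias if linear.bias is not None else torch.zeros(linear.out_features)
        merged.bias.copy_(gamma * base_bias + beta)
    return merged


if __name__ == "__main__":
    # Quick check that the merged layer matches the SSF-augmented original.
    torch.manual_seed(0)
    lin = nn.Linear(8, 16)
    gamma, beta = torch.randn(16), torch.randn(16)
    x = torch.randn(4, 8)
    reference = gamma * lin(x) + beta
    merged_out = merge_ssf_into_linear(lin, gamma, beta)(x)
    assert torch.allclose(reference, merged_out, atol=1e-6)
```

The same folding idea applies to other linear operations in the backbone (e.g. convolutions, by scaling their output channels), which is why inference needs no architecture change beyond the task-specific head.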