Megvii-BaseDetection / DisAlign

Implementation of "Distribution Alignment: A Unified Framework for Long-tail Visual Recognition"(CVPR 2021)
Apache License 2.0

Is it correct to freeze the weight and bias of the DisAlign Linear Layer as well? #22

Closed jeongHwarr closed 2 years ago

jeongHwarr commented 2 years ago

Hello. Thank you for your project! I'm testing your code on my custom dataset. My task is classification. I have a question about your code implementation.

https://github.com/Megvii-BaseDetection/DisAlign/blob/a2fc3500a108cb83e3942293a5675c97ab3a2c6e/classification/imagenetlt/resnext50/resx50.scratch.imagenet_lt.224size.90e.disalign.10e/net.py#L56-L62

From my understanding, in stage 2 the linear layer used in stage 1 is removed and a DisAlign linear layer is added, and all parameters are frozen except logit_scale, logit_bias, and confidence_layer. At this point, the weight and bias of DisAlignLinear (self.weight, self.bias) are also frozen. Is my understanding correct?

If so, are the weight and bias of the DisAlign linear layer fixed to their initialized values? (The weight and bias of the stage-1 linear layer are not copied over either.)

If my understanding is correct, why are the weight and bias of DisAlignLinear also frozen?

I look forward to your reply. Thanks!

tonysy commented 2 years ago

Hi, in our proposed method, we keep the original classifier and add a learnable scale/bias for stage-2 learning. Please see the adaptive calibration function section of the original paper.
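For concreteness, here is a minimal numpy sketch of that adaptive calibration: the original (frozen) classifier logits are scaled and shifted, then blended with the unmodified logits using a per-sample confidence produced by a small linear layer on the features. All names (`calibrate`, `w_conf`, `b_conf`) are illustrative, not the repo's actual API.

```python
import numpy as np

def calibrate(logits, features, w_conf, b_conf, scale, bias):
    """Adaptive calibration sketch (names illustrative):
    sigma * (scale * z + bias) + (1 - sigma) * z,
    where z are the frozen stage-1 logits and sigma is a
    per-sample confidence computed from the features."""
    sigma = 1.0 / (1.0 + np.exp(-(features @ w_conf + b_conf)))  # shape (N, 1)
    return sigma * (scale * logits + bias) + (1.0 - sigma) * logits

# At initialization (scale = 1, bias = 0) the calibrated logits
# equal the original logits, so stage 2 starts from the stage-1 model.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 5))
logits = rng.normal(size=(3, 4))
w_conf = rng.normal(size=(5, 1))
out = calibrate(logits, feats, w_conf, 0.0, np.ones(4), np.zeros(4))
```

Note that only `scale`, `bias`, and the confidence parameters would be trained in stage 2; the logits themselves come from the frozen classifier.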

jeongHwarr commented 2 years ago

@tonysy Thank you for your reply! I thought that the original classifier would be removed and replaced with a new classifier.

https://github.com/Megvii-BaseDetection/DisAlign/blob/a2fc3500a108cb83e3942293a5675c97ab3a2c6e/classification/imagenetlt/resnext50/resx50.scratch.imagenet_lt.224size.90e.disalign.10e/net.py#L21-L27

I couldn't run cvpods because my OS is Windows.

Does the above code mean a classifier to which extra elements are added while retaining the original classifier weights?

tonysy commented 2 years ago

> Does the above code mean a classifier to which extra elements are added while retaining the original classifier weights?

Yes
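In other words, the stage-1 weights are carried into the new layer and frozen, while only the added scale, bias, and confidence parameters remain trainable. A hedged PyTorch sketch of that setup (the class below is illustrative, not the repo's exact `DisAlignLinear`):

```python
import torch
import torch.nn as nn

class DisAlignLinear(nn.Linear):
    """Sketch: the original classifier plus learnable scale/bias and a
    confidence layer (names assumed from the discussion above)."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__(in_features, out_features)
        self.logit_scale = nn.Parameter(torch.ones(out_features))
        self.logit_bias = nn.Parameter(torch.zeros(out_features))
        self.confidence_layer = nn.Linear(in_features, 1)

    def forward(self, x):
        logits = super().forward(x)                      # frozen classifier
        sigma = torch.sigmoid(self.confidence_layer(x))  # per-sample confidence
        return sigma * (self.logit_scale * logits + self.logit_bias) \
            + (1 - sigma) * logits

# Stage 2 setup: retain the stage-1 classifier weights, then freeze them.
stage1_fc = nn.Linear(16, 4)
clf = DisAlignLinear(16, 4)
with torch.no_grad():
    clf.weight.copy_(stage1_fc.weight)
    clf.bias.copy_(stage1_fc.bias)
clf.weight.requires_grad_(False)
clf.bias.requires_grad_(False)
```

Because logit_scale starts at 1 and logit_bias at 0, the calibrated output initially equals the stage-1 logits, so stage-2 training starts exactly from the stage-1 decision boundary.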

jeongHwarr commented 2 years ago

@tonysy Okay. Thank you!