zhangyongshun / BagofTricks-LT

A scientific and useful toolbox, which contains practical and effective long-tail related tricks with extensive experimental results
MIT License
575 stars 76 forks

About DRS training #4

Closed adf1178 closed 3 years ago

adf1178 commented 3 years ago

Hello! Thanks for your contribution. I have a question. The DRS strategy described in Decoupling Representation and Classifier for Long-Tailed Recognition is: first train the whole network for 90 or 200 epochs, then freeze the backbone, re-initialize the classifier, and train only the classifier. But the DRS strategy in the code is just to change a different sampler? Or am I misunderstanding the code?

zhangyongshun commented 3 years ago

Thanks for your question! "But the DRS strategy in the code is just to change a different sampler?": Yes, DRS in our code only switches the sampler from the default sampler to a balanced sampler. The details of DRS were first described in the LDAM Loss paper (https://arxiv.org/pdf/1906.07413.pdf), and the idea was first introduced in Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning (https://arxiv.org/pdf/1806.06193.pdf), where the backbone is left unfrozen in the second stage of DRS. I think Decoupling Representation shows another way to explore the balance between the backbone and the classifier, and it is different from DRS. Besides, I think Decoupling Representation can be seen as a post-processing trick rather than a DRS method.
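To make the "just change the sampler" point concrete, here is a minimal sketch of a deferred switch from default sampling to class-balanced sampling. It is not the repository's actual code: the function names, the toy label list, and the switch epoch `drs_start_epoch=160` are all assumptions chosen for illustration, and it uses only the Python standard library. Nothing is frozen or re-initialized at the switch; only the way indices are drawn changes, which is the difference from the decoupling recipe described in the question.

```python
import random
from collections import Counter


def class_balanced_weights(labels):
    """Per-sample weights such that every class is drawn with equal probability."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def sample_epoch_indices(labels, epoch, drs_start_epoch, rng):
    """Return the dataset indices visited in one epoch.

    Before drs_start_epoch: default sampling (a random permutation of the data).
    From drs_start_epoch on: class-balanced sampling with replacement.
    The model itself (backbone and classifier) is untouched by this switch.
    """
    n = len(labels)
    if epoch < drs_start_epoch:
        indices = list(range(n))
        rng.shuffle(indices)
        return indices
    weights = class_balanced_weights(labels)
    return rng.choices(range(n), weights=weights, k=n)


# Toy long-tailed dataset (hypothetical): 900 head-class and 100 tail-class samples.
labels = [0] * 900 + [1] * 100
rng = random.Random(0)

# drs_start_epoch=160 is an illustrative value, not one taken from the repository.
stage1 = sample_epoch_indices(labels, epoch=0, drs_start_epoch=160, rng=rng)
stage2 = sample_epoch_indices(labels, epoch=160, drs_start_epoch=160, rng=rng)

stage1_counts = Counter(labels[i] for i in stage1)  # follows the long-tailed distribution
stage2_counts = Counter(labels[i] for i in stage2)  # roughly balanced across classes
```

In a PyTorch training loop, the second stage would more likely rebuild the `DataLoader` with something like `torch.utils.data.WeightedRandomSampler` using the same per-sample weights, while continuing to optimize all model parameters.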

adf1178 commented 3 years ago

Got it! Thanks again for your patient and timely reply!