zhmiao / OpenLongTailRecognition-OLTR

PyTorch implementation for "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019, Oral)
BSD 3-Clause "New" or "Revised" License

Are the backbone nets used by the compared methods also frozen during training? #27

Closed jchhuang closed 4 years ago

jchhuang commented 4 years ago

I am curious whether the weights of the backbone net used by the compared methods are also frozen when training on the datasets. If so, I think the comparison is not fair, since those methods cannot reach their full performance when their weights are frozen.

zhmiao commented 4 years ago

@jchhuang As written in the paper, in the ImageNet experiments no method's weights are frozen. In the Places experiments, where the feature extractor weights are frozen, the other methods' weights are frozen as well.

jchhuang commented 4 years ago

@zhmiao Thanks for your reply. Could you explain further why the weights are not frozen in the ImageNet experiments but are frozen in the Places experiments?

jchhuang commented 4 years ago

@zhmiao If the other methods are frozen as well, then their performance is not fully realized, so the comparison results in your paper cannot show that your method outperforms the others, can they?

zhmiao commented 4 years ago

@jchhuang As I said, in the ImageNet experiments the weights are not frozen for any method. In the Places experiments the weights are frozen for all methods, including ours. If that is not a fair comparison, I am not sure what is. Are you suggesting we should freeze the weights in our method but not in the others? Would that be fair? I am not sure what you are getting at. We have been very clear about this part in the paper; please read it. Thanks.

In addition, if you think freezing the feature extractor and training only the classifier (a pretty common approach) is not enough, we still have the ImageNet experiments, where all weights are randomly initialized and nothing is frozen. On the other hand, when the feature extractor is frozen, you can think of the model as just the final classifier, whose inputs are not images but extracted features. It is not a question of whether performance is "exhausted": we are simply not using the full capacity of the deep network, but we still fully utilize the power of all the other losses and methods we compare against.
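To make the frozen-feature-extractor setup concrete, here is a minimal PyTorch sketch of the general pattern being described. The tiny `backbone` below is a hypothetical stand-in (the paper's experiments use pre-trained ResNet features); only the freezing mechanics are the point.

```python
import torch
import torch.nn as nn

# Hypothetical toy backbone standing in for a pre-trained feature extractor.
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
classifier = nn.Linear(16, 10)

# Freeze the feature extractor: its weights receive no gradient updates.
for p in backbone.parameters():
    p.requires_grad = False
backbone.eval()  # also fixes batch-norm / dropout behavior, if any

# Only the classifier's parameters are passed to the optimizer.
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.1)

x = torch.randn(8, 32)            # dummy inputs
y = torch.randint(0, 10, (8,))    # dummy labels

# With a frozen backbone, the features can be extracted once and reused;
# the trainable model is effectively just the classifier on those features.
with torch.no_grad():
    feats = backbone(x)

loss = nn.functional.cross_entropy(classifier(feats), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# The backbone received no gradients; only the classifier was updated.
assert all(p.grad is None for p in backbone.parameters())
assert classifier.weight.grad is not None
```

This is what "the inputs are not images but extracted features" amounts to in code: the comparison between methods then isolates the classifier (and loss) design, since every method sees the same fixed features.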

liuziwei7 commented 4 years ago

Using frozen network weights is equivalent to employing ImageNet pre-trained features, which is common practice in the few-/zero-shot learning community. For example, a closely related work, "Learning to Model the Tail", also adopted a frozen ResNet-152 as the backbone network for its experimental comparisons.

This practice also has the advantage of separating the effect of the feature representation from that of the classifier, which highlights the true merit of the classifier design (i.e., the desired information flow inside the classifier).

Our open long-tailed recognition work serves as a starting point, not an end point. You are more than welcome to work on this topic and contribute.