Hello, reading your code carefully (for ImageNet-LT), it seems that the plain model was trained for 30 epochs while your own model was trained for 90 epochs (30 for stage 1 + 60 for stage 2). Could you please confirm this? Moreover, were all the comparisons (focal loss, etc.) performed with 30 or 90 epochs?

Thanks!
@matttrd Thanks for asking. The plain model was only trained for 30 epochs, while the rest of the methods were trained for 90 epochs. As discussed in https://github.com/zhmiao/OpenLongTailRecognition-OLTR/issues/4#issuecomment-496071369 , more epochs would not help the performance of the plain model, since it had already converged around 30 epochs.
@zhmiao thanks for answering. Probably I'm missing something, but I trained (using your code) the plain model (stage 1) for 90 epochs, with the usual learning-rate drop every 30 epochs, and got 32.7% overall top-1 test accuracy and:
Many: 53.6%
I understand that the loss converges around 30 epochs, but that is true only if you let the LR drop every 10 epochs, as you did. I think this would be the fair comparison. Am I wrong?
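For concreteness, the two schedules being compared can be expressed with a standard PyTorch step scheduler. This is a minimal sketch under the assumptions above (SGD, 10x decay); the model, base LR, and momentum are placeholders, not the repository's actual config:

```python
import torch.nn as nn
import torch.optim as optim

# Placeholder stand-in for the stage-1 backbone; not the repo's model class.
model = nn.Linear(512, 1000)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Released schedule: 30 epochs total, LR dropped by 10x every 10 epochs.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
# The 90-epoch run reported above: LR dropped by 10x every 30 epochs instead.
# scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(30):  # or range(90) for the longer schedule
    # ... one training pass over the data would go here ...
    scheduler.step()
```

Under the first schedule the LR has already decayed twice by epoch 30, which is why the loss plateaus there; the second schedule keeps the initial LR for 30 full epochs.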
Thanks for reporting these results! We have actually made some similar observations in our follow-up project.
A model trained on a long-tailed dataset shows some interesting behavior changes with different initializations and numbers of epochs.
We speculate that the root of these phenomena lies in the learning dynamics of long-tail-trained models. Therefore, we are considering updating our manuscript to report a sequence of snapshot accuracies instead of a single final accuracy. It is still an open question, and we believe it is definitely an interesting direction to investigate further.
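Reporting snapshot accuracies only requires evaluating at fixed epoch intervals during training. A minimal sketch of what that might look like (the training helpers and the 10-epoch interval are hypothetical, not taken from the released code):

```python
import torch

@torch.no_grad()
def top1_accuracy(model, loader, device="cpu"):
    """Fraction of validation samples whose argmax prediction matches the label."""
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        logits = model(images.to(device))
        correct += (logits.argmax(dim=1).cpu() == labels).sum().item()
        total += labels.size(0)
    return correct / total

# Record an (epoch -> top-1 accuracy) curve instead of one final number.
# `train_one_epoch`, the loaders, optimizer, and scheduler are hypothetical
# stand-ins for the repo's actual training loop.
snapshots = {}
for epoch in range(1, 91):
    train_one_epoch(model, train_loader, optimizer)
    scheduler.step()
    if epoch % 10 == 0:
        snapshots[epoch] = top1_accuracy(model, val_loader)
```

A curve like this would make schedule-dependent effects (such as the 30-epoch plateau discussed above) visible directly, rather than being hidden in a single final number.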
The aim of our open long-tailed recognition paper is to formally define and clarify this important real-world problem, rather than to provide a silver-bullet solution. We welcome everyone to work on this topic and further improve the underlying approaches :)
@liuziwei7 thanks for your clarifications! We are actually working on a similar project and have made the same observations. I really like (and agree with) the approach of tackling open long-tailed recognition problems within a single framework. It is still an ongoing project, but we were able to obtain good results with a theory-driven input sampling (without attention, a hallucinator, etc.). We will try our algorithm on your framework. Thanks again!