Closed by hongxin001 2 years ago
Please provide more details, e.g., which config did you use, what results did you get, and the training logs.
For example, on Places-LT, I used the config ./config/Places_LT/decouple_balanced_softmax.yaml and got 35.1. The training log is attached: places_dt_bs.log
I notice that the batch size in your log is 128 while the batch size in the config is 256.
Thanks for your advice.
We made a typo in Places-LT's config. The sampler option should be 'null' instead of 'ClassAwareSampler'. Thank you for raising the issue; we will fix the config ASAP.
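Concretely, the fix described above is a one-line change in the YAML; a sketch (the key name is illustrative and should be matched to the actual field in decouple_balanced_softmax.yaml):

```yaml
# Corrected sampler setting for ./config/Places_LT/decouple_balanced_softmax.yaml
# (key name is illustrative; match it to the file's actual field)
sampler: null          # was: ClassAwareSampler
```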
Are you still unable to reproduce CIFAR-10-LT's result?
I will give it a try.
I found that the sampler option in CIFAR-10-LT is also ClassAwareSampler. Should I change it to 'null'?
I don't see any ClassAwareSampler in CIFAR-10-LT's configs. Would you mind specifying the config name?
I see. That's my fault.
The result on Places-LT is still not good. Could you give it a try on your side?
May I have a look at the new training log?
We only ran and reported the 'BALMS' setting on Places-LT in the paper. The config you used is not part of the paper, so I am not sure what result it is supposed to give. Although the low-shot accuracy in your training log seems very low (even lower than the baseline), the result you got may well be correct.
I will remove the config to avoid further confusion. If you are still interested in using Balanced Softmax in Places-LT, I would recommend end-to-end training instead of decoupled training. Recent papers [1, 2] report very competitive results (39.4 top-1 acc) using 'end-to-end + Balanced Softmax' on Places-LT.
[1] Disentangling Label Distribution for Long-tailed Visual Recognition
[2] Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision
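For context, Balanced Softmax adds the log of each class's training frequency to the logits before the softmax cross-entropy, so head classes must win by a larger margin. A minimal NumPy sketch of the loss (the function name and signature are illustrative, not the repository's implementation):

```python
import numpy as np

def balanced_softmax_loss(logits, labels, class_counts):
    """Cross-entropy over logits shifted by log class frequencies.

    With uniform class_counts the shift is a constant and cancels inside
    the softmax, recovering the standard softmax cross-entropy.
    """
    shifted = logits + np.log(np.asarray(class_counts, dtype=float))
    shifted = shifted - shifted.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

In the end-to-end setting recommended above, this loss simply replaces the standard cross-entropy during training.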
I see. Thanks for the explanation and the suggested papers. Although it does not work in this setting, Balanced Softmax is still a valuable method in this area.
I found there is a config for BalancedSoftmax + decoupled training in CIFAR-10-LT, which trains a classifier with Balanced Softmax on top of the feature extractor from standard training. However, I cannot replicate the result reported in the paper using this config. Is there anything different about this setting?
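For what it's worth, the decoupled setting described here amounts to retraining only the linear classifier on frozen features with the Balanced Softmax loss. A toy sketch under stated assumptions (synthetic Gaussian features standing in for a frozen backbone, plain gradient descent standing in for the repo's optimizer; all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for frozen backbone features on a long-tailed set:
# 40/15/5 samples for classes 0/1/2, with class-dependent means so the
# classifier has something to learn.
labels = np.repeat([0, 1, 2], [40, 15, 5])
features = rng.normal(size=(60, 8)) + labels[:, None]
class_counts = np.bincount(labels)

def balanced_log_softmax(logits):
    z = logits + np.log(class_counts)            # Balanced Softmax shift
    z = z - z.max(axis=1, keepdims=True)         # numerical stability
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def loss(W):
    lp = balanced_log_softmax(features @ W)
    return -lp[np.arange(len(labels)), labels].mean()

W = np.zeros((8, 3))                             # only the classifier trains
initial_loss = loss(W)
onehot = np.eye(3)[labels]
for _ in range(300):                             # plain gradient descent
    p = np.exp(balanced_log_softmax(features @ W))
    W -= 0.1 * features.T @ (p - onehot) / len(labels)
final_loss = loss(W)
```

The features stay fixed throughout; only `W` is updated, which is the essence of the decoupled stage.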