Thanks for your questions!
We referred to ImageNet-LT [1-2], where the largest class has 1,000 samples, when constructing the dataset. The dataset is indeed rather small because there are only 80 classes, but further raising the sample sizes of the head classes would either lead to an extremely imbalanced distribution, or prevent us from strictly limiting the sizes of the tail classes. Notice that we use the widely adopted long-tailed class splits: head/many-shot (>100 samples), medium-shot (20-100 samples), and tail/few-shot (<20 samples).
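For concreteness, here is a minimal sketch of that split convention, assuming `class_counts` is a dict mapping each class id to its number of training samples (the name is hypothetical, not from the repo):

```python
def split_classes(class_counts):
    """Bucket classes into the widely adopted long-tailed splits."""
    splits = {"head": [], "medium": [], "tail": []}
    for cls, n in class_counts.items():
        if n > 100:      # head / many-shot: > 100 samples
            splits["head"].append(cls)
        elif n >= 20:    # medium-shot: 20-100 samples
            splits["medium"].append(cls)
        else:            # tail / few-shot: < 20 samples
            splits["tail"].append(cls)
    return splits
```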
Hope these details will help you~
[1] Kang et al., "Decoupling Representation and Classifier for Long-Tailed Recognition." In ICLR 2020.
[2] Liu et al., "Large-Scale Long-Tailed Recognition in an Open World." In CVPR 2019.
[3] Zhou et al., "BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition." In CVPR 2020.
Hi @wutong16, thanks for your explanation. For long-tailed distributions, my previous view focused more on the relative numbers between categories, i.e., if label A has only one percent as many samples as label B, then label A is a minority. Your point concentrates more on the absolute number of samples in a category. It truly makes sense.
It is great work, thanks!
By the way:
1. `class_aware_sample_generator` with `num_samples_cls = 3` means one batch consists of tuples of 3 images with the same target label? For a single GPU with batch size 32, the batch is then made up of 10 labels with 3 images each and 1 label with 2 images, which is a little weird.
2. For `self.num_sample` in `ClassAwareSampler`, why do we need the `reduce` param and set `num_samples_cls=3, reduce=4`? Is there some intuitive reason?
Hi!
Different settings of `num_samples_cls` within a proper range do not influence the results much; we tried 1, 2, 3, and 4. But yes, it's better to use an even number to avoid the situation you mentioned.
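To make the grouping concrete, here is a minimal sketch (not the repo's actual implementation; `class_to_indices`, a dict from class id to that class's sample indices, is a hypothetical name) of a class-aware stream that yields dataset indices in groups of `num_samples_cls`:

```python
import random

def class_aware_index_stream(class_to_indices, num_samples_cls):
    """Yield dataset indices: pick a class, then a few samples of it."""
    classes = list(class_to_indices)
    while True:
        cls = random.choice(classes)  # sample a class (roughly) uniformly
        # emit num_samples_cls indices from that class, with replacement
        yield from random.choices(class_to_indices[cls], k=num_samples_cls)
```

With `batch_size=32` and `num_samples_cls=3`, a batch cuts the stream mid-group (10 full groups plus 2 leftover images), whereas 2 or 4 divides 32 evenly.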
The parameter `reduce` controls the total number of samples drawn in an epoch. Since the imbalance and head-class dominance (usually the class 'person') are severe, we don't want to take `N_max * C` samples per epoch; that would be too many, and we would have to cut the total number of epochs. So we take `N_max * C / reduce` samples instead, which may slightly down-sample one or two head classes. Similarly, a `reduce` within a proper range (not too big) won't influence the results much in our experiments, but it does make a small difference.
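In pseudocode, the epoch length then looks roughly like this (a sketch under assumed names: `max_count` stands for `N_max`, and the actual attribute names in `ClassAwareSampler` may differ):

```python
class ClassAwareSampler:
    def __init__(self, max_count, num_classes, reduce=4):
        # Drawing N_max * C indices per epoch is far too many when one
        # head class dominates, so divide by `reduce`; this may slightly
        # down-sample the one or two largest classes.
        self.num_samples = int(max_count * num_classes / reduce)

    def __len__(self):
        return self.num_samples
```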
Thanks for your detailed code!
About `max=1200, min=1`: I am confused by these parameter settings. The train set is too small; why not try setting a larger `max` param to construct a larger train set?