BeierZhu / xERM

[AAAI 2022 Oral] This is a PyTorch implementation of the AAAI 2022 paper "Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification"

xERM-PC performance #2

Closed zuglerQ closed 2 years ago

zuglerQ commented 2 years ago

Dear Beier, I could not reproduce your reported accuracy of 53.2 with the xERM-PC code. With your config ce.yaml, my result is 0.527. The accuracies are:

- Imbalanced Acc: 0.461
- Balanced Acc: 0.513
- xERM Acc: 0.527

Could you give some hints on how to reproduce your result?

Best

BeierZhu commented 2 years ago

Hello, the xERM performance depends on the performance of the balanced model. I just tested my balanced model, which gives an accuracy of 51.6. I think randomness makes the difference: the worse PC (51.3 vs. 51.6) will produce a worse xERM-PC (52.7 vs. 53.2).
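If run-to-run randomness is the suspect, one common way to narrow it down is to fix the random seeds before training both the balanced model and xERM. The following is a minimal sketch and is not taken from the xERM repository: the helper name `seed_everything` is hypothetical, and the NumPy and PyTorch imports are guarded so the snippet also runs where they are not installed. Even with fixed seeds, multi-GPU training and non-deterministic CUDA kernels can still cause small differences, so this reduces the variance rather than guaranteeing identical numbers.

```python
import os
import random


def seed_everything(seed: int = 0) -> None:
    """Hypothetical helper: seed the common sources of randomness."""
    random.seed(seed)
    # Only affects hash randomization in subprocesses started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; nothing more to seed

    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
    # Trade some speed for more repeatable cuDNN behaviour.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


seed_everything(0)
```

Calling the helper once at the top of the training script, with the same seed in both training stages, makes it easier to tell whether a 0.3 point gap in the balanced model comes from randomness or from a configuration difference.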

Here is my log:
--------------------------------------------------------------------------------------------
=====> No sampler.
=====> Shuffle is False.
Using 4 GPUs.
Loading Dot Product Classifier.
Loading Scratch ResNext 50 Feature Model.
======> Last ReLU: True
===> Saving cfg parameters to: ./exp_results/ImageNet_LT/resnext50_cross_entropy_e90/cfg.yaml
Validation on the best model.
Loading model from ./exp_results/ImageNet_LT/resnext50_cross_entropy_e90/final_model_checkpoint.pth
============> Load Moving Average <===========
=====> All keys in weights have been loaded to the module classifier
============> Load Moving Average <===========
=====> All keys in weights have been loaded to the module feat_model
Phase: test
0%| | 0/453 [00:00<?, ?it/s]
/home/beier/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
100%|██████████| 453/453 [09:48<00:00, 1.30s/it]
100%|██████████| 79/79 [02:03<00:00, 1.56s/it]
100%|██████████| 196/196 [04:12<00:00, 1.29s/it]
100%|██████████| 196/196 [00:34<00:00, 5.71it/s]

Phase: test

Evaluation_accuracy_micro_top1: 0.516
Averaged F-measure: 0.503
Many_shot_accuracy_top1: 0.626
Median_shot_accuracy_top1: 0.489
Low_shot_accuracy_top1: 0.298

62.6 48.9 29.8 51.6
========================= ALL COMPLETED =========================
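For readers comparing their own logs, the many/median/low-shot numbers above are per-class accuracies averaged within groups of classes, where the groups are defined by how many training images each class has. The sketch below shows one way such a split could be computed. It is an illustration and not the repository's evaluation code: the function name `shot_accuracy` is hypothetical, and the thresholds (more than 100 training images for many-shot, fewer than 20 for low-shot) are an assumption based on the split commonly used for ImageNet-LT.

```python
def shot_accuracy(preds, labels, train_class_count,
                  many_shot_thr=100, low_shot_thr=20):
    """Average per-class accuracy within many/median/low-shot groups.

    preds, labels: sequences of predicted and true class ids (test set).
    train_class_count: mapping from class id to its training-set size.
    The thresholds are assumed defaults, not values read from the repo.
    """
    correct, total = {}, {}
    for p, y in zip(preds, labels):
        total[y] = total.get(y, 0) + 1
        correct[y] = correct.get(y, 0) + int(p == y)

    groups = {"many": [], "median": [], "low": []}
    for c, n in total.items():
        acc = correct[c] / n
        if train_class_count[c] > many_shot_thr:
            groups["many"].append(acc)
        elif train_class_count[c] < low_shot_thr:
            groups["low"].append(acc)
        else:
            groups["median"].append(acc)

    # An empty group has no defined accuracy; report NaN for it.
    return {k: (sum(v) / len(v) if v else float("nan"))
            for k, v in groups.items()}


# Toy example with three classes of very different training sizes.
toy_counts = {0: 500, 1: 50, 2: 5}
toy_labels = [0, 0, 1, 1, 2, 2]
toy_preds = [0, 0, 1, 0, 0, 0]
toy_result = shot_accuracy(toy_preds, toy_labels, toy_counts)
```

In the toy example the model is right on every class-0 image, half of the class-1 images, and none of the class-2 images, which mirrors the pattern in the log where accuracy drops from the many-shot group to the low-shot group.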