iCGY96 / ARPL

[TPAMI 2022] Adversarial Reciprocal Points Learning for Open Set Recognition
https://ieeexplore.ieee.org/document/9521769
MIT License

Questions regarding the baseline results #2

Closed (Duconnor closed this issue 3 years ago)

Duconnor commented 3 years ago

Hi there! Thanks for your inspiring work and for releasing the code. I have a small question regarding the baseline results. I did not modify the code and ran it with the command python osr.py --dataset cifar10 --loss Softmax. If I understand correctly, this is the baseline method, and according to Table 1 in your paper, the AUROC should be 67.7 on the CIFAR10 dataset. However, the log I obtained is as follows:

,0,1,2,3,4
TNR,34.0,30.625000000000004,22.624999999999996,35.175,30.500000000000004
AUROC,86.9976125,85.59652499999999,84.34554583333333,86.77233749999999,86.72411666666667
DTACC,80.25416666666668,79.03333333333333,78.1625,79.575,79.97916666666667
AUIN,92.19165744328468,90.77076215295214,90.76942678194541,91.79811544137021,91.64555773527704
AUOUT,77.31935437681796,75.15246512319506,72.03227915144676,77.35292209454035,76.84887201588833
ACC,94.16666666666667,95.58333333333333,91.35,95.3,95.18333333333334
OSCR,84.46526250000002,84.14960000000002,80.82762291666654,84.88179583333343,84.94574374999975
unknown,"[0, 8, 3, 5]","[2, 3, 4, 5]","[0, 8, 2, 6]","[8, 2, 3, 5]","[8, 2, 3, 5]"
known,"[2, 4, 1, 7, 9, 6]","[8, 6, 1, 9, 0, 7]","[1, 5, 7, 3, 9, 4]","[7, 6, 4, 9, 0, 1]","[0, 6, 4, 9, 1, 7]"

The average AUROC across the five splits is about 86.09, which is significantly higher than the reported result. Is there anything I haven't done properly? Thanks in advance!
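The averaged figure can be checked directly from the log. The snippet below is a minimal sketch, assuming the log is available as CSV text in the layout shown above. Only three rows of the posted log are reproduced, and the helper name split_means is hypothetical (it is not part of the ARPL code).

```python
import csv
import io

# Three rows copied from the log posted above; the remaining rows are omitted.
LOG = """\
,0,1,2,3,4
AUROC,86.9976125,85.59652499999999,84.34554583333333,86.77233749999999,86.72411666666667
OSCR,84.46526250000002,84.14960000000002,80.82762291666654,84.88179583333343,84.94574374999975
unknown,"[0, 8, 3, 5]","[2, 3, 4, 5]","[0, 8, 2, 6]","[8, 2, 3, 5]","[8, 2, 3, 5]"
"""


def split_means(log_text):
    """Return the mean over splits for every numeric row of the log.

    Hypothetical helper: rows whose cells are not numbers (the header and
    the class-list rows) are skipped.
    """
    means = {}
    for row in csv.reader(io.StringIO(log_text)):
        if not row or not row[0]:
            continue  # header row has an empty first cell
        try:
            values = [float(cell) for cell in row[1:]]
        except ValueError:
            continue  # e.g. the "unknown"/"known" class-list rows
        means[row[0]] = sum(values) / len(values)
    return means


means = split_means(LOG)
print(f"AUROC mean: {means['AUROC']:.2f}")  # 86.09
print(f"OSCR mean:  {means['OSCR']:.2f}")   # 83.85
```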

iCGY96 commented 3 years ago

The baseline AUROC results in our paper are taken from [1], the paper that first proposed this experiment. Our implementation of the baseline does indeed achieve a higher AUROC than the results in [1]. For more on this discrepancy, please refer to the following issue: https://github.com/lwneal/counterfactual-open-set/issues/5

However, many later works have adopted this result. To reevaluate the baseline, we added OSCR (a better metric) results to the paper and evaluated all baseline methods with it. I hope this helps.
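For readers unfamiliar with OSCR, the following is a minimal sketch of the metric as it is commonly defined: the area under the curve of correct classification rate (CCR) on known samples against false positive rate (FPR) on unknown samples, swept over a confidence threshold. This is an illustration under that assumption, not the implementation used in this repository, and the function name and arguments are hypothetical.

```python
def oscr(known_scores, known_correct, unknown_scores):
    """Area under the CCR-vs-FPR curve (illustrative sketch).

    known_scores   : confidence score for each known-class test sample
    known_correct  : True where the closed-set prediction is correct
    unknown_scores : confidence score for each unknown-class test sample
    """
    n_known = len(known_scores)
    n_unknown = len(unknown_scores)
    # Sweep the threshold from the highest score down to the lowest.
    thresholds = sorted(set(known_scores) | set(unknown_scores), reverse=True)

    points = [(0.0, 0.0)]  # threshold above every score: nothing accepted
    for t in thresholds:
        # CCR: known samples accepted (score >= t) AND correctly classified.
        ccr = sum(
            1 for s, ok in zip(known_scores, known_correct) if ok and s >= t
        ) / n_known
        # FPR: unknown samples wrongly accepted as known.
        fpr = sum(1 for s in unknown_scores if s >= t) / n_unknown
        points.append((fpr, ccr))

    # Trapezoidal integration of CCR over FPR.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data: known samples all correct and scored above every unknown sample.
print(oscr([0.9, 0.8], [True, True], [0.1, 0.2]))  # 1.0
```

Unlike AUROC, which only measures how well known and unknown samples are separated, this quantity also penalizes misclassified known samples, so the curve tops out at the closed-set accuracy.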

[1] Open Set Learning with Counterfactual Images, ECCV 2018