Closed philipperemy closed 4 years ago
For example, for SDENet trained on SVHN with CIFAR-10 as OOD data, we get AUPR Out: 84.017%, but the paper reports 93.7±0.9.
I understand that it cannot be 93.7 exactly since it's an average but 84% seems quite low (same as the other methods).
Model is save_sdenet_svhn
With the parameters in the repo, the performance should be slightly better than what the table reports. It is strange that your OOD performance is so low for SVHN. I changed the parameters for the SVHN dataset back to the original values. Let me know if it still doesn't work for you on SVHN.
ok it looks better now I guess. we have 91.7% and the paper quotes 93.7±0.9.
Load model
load target data: svhn
Building SVHN data loader with 1 workers
Using downloaded and verified file: ../data/svhn/train_32x32.mat
Using downloaded and verified file: ../data/svhn/test_32x32.mat
load non target data: cifar10
Building CIFAR-10 data loader with 1 workers
Files already downloaded and verified
Files already downloaded and verified
generate log from in-distribution data
Final Accuracy: 24345/25856 (94.16%)
generate log from out-of-distribution data
calculate metrics for OOD
OOD Performance of Baseline detector
TNR at TPR 95%: 83.940%
AUROC: 97.134%
Detection acc: 92.115%
AUPR In: 98.848%
AUPR Out: 91.691%
calculate metrics for mis
mis Performance of Baseline detector
TNR at TPR 95%: 66.844%
AUROC: 92.667%
Detection acc: 87.156%
AUPR In: 99.357%
AUPR Out: 54.045%
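For readers unfamiliar with the metrics in the log above, here is a minimal sketch of how AUROC and TNR at TPR 95% can be computed from detector scores. It is not the repo's evaluation code, which is not shown in this thread. It assumes that in-distribution samples are the positive class and that a higher score means "more in-distribution"; the function names and the toy scores are hypothetical.

```python
import math


def auroc(pos, neg):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def tnr_at_tpr(pos, neg, tpr=0.95):
    """TNR at the highest threshold that still accepts at least `tpr` of the positives."""
    ranked = sorted(pos, reverse=True)
    k = math.ceil(tpr * len(ranked))  # number of positives that must be accepted
    threshold = ranked[k - 1]         # a sample is accepted when score >= threshold
    rejected = sum(1 for n in neg if n < threshold)
    return rejected / len(neg)


# Toy scores, made up purely for illustration.
in_scores = [0.9, 0.8, 0.7, 0.6]     # in-distribution (positive)
out_scores = [0.65, 0.3, 0.2, 0.1]   # out-of-distribution (negative)

print(f"AUROC: {100 * auroc(in_scores, out_scores):.3f}%")
print(f"TNR at TPR 95%: {100 * tnr_at_tpr(in_scores, out_scores):.3f}%")
```

The pairwise AUROC loop is quadratic in the number of samples, which is fine for a small example but too slow for a full test set; a rank-based implementation would be the usual choice there.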
It seems to work now. The remaining gap should just be the variance of the results. I just ran the model five more times. Below are my results:
run1:
Final Accuracy: 24413/25856 (94.42%)
generate log from out-of-distribution data
calculate metrics for OOD
OOD Performance of Baseline detector
TNR at TPR 95%: 83.466%
AUROC: 97.212%
Detection acc: 92.173%
AUPR In: 98.933%
AUPR Out: 91.916%
calculate metrics for mis
mis Performance of Baseline detector
TNR at TPR 95%: 67.344%
AUROC: 92.087%
Detection acc: 86.855%
AUPR In: 99.336%
AUPR Out: 52.726%
run2:
Final Accuracy: 24273/25856 (93.88%)
generate log from out-of-distribution data
calculate metrics for OOD
OOD Performance of Baseline detector
TNR at TPR 95%: 87.640%
AUROC: 97.761%
Detection acc: 92.734%
AUPR In: 99.153%
AUPR Out: 93.575%
calculate metrics for mis
mis Performance of Baseline detector
TNR at TPR 95%: 65.276%
AUROC: 92.183%
Detection acc: 86.330%
AUPR In: 99.325%
AUPR Out: 54.205%
run3:
Final Accuracy: 24375/25856 (94.27%)
generate log from out-of-distribution data
calculate metrics for OOD
OOD Performance of Baseline detector
TNR at TPR 95%: 92.325%
AUROC: 98.500%
Detection acc: 94.053%
AUPR In: 99.410%
AUPR Out: 95.651%
calculate metrics for mis
mis Performance of Baseline detector
TNR at TPR 95%: 65.903%
AUROC: 91.588%
Detection acc: 85.871%
AUPR In: 99.268%
AUPR Out: 53.556%
run4:
Final Accuracy: 24416/25856 (94.43%)
generate log from out-of-distribution data
calculate metrics for OOD
OOD Performance of Baseline detector
TNR at TPR 95%: 85.298%
AUROC: 97.520%
Detection acc: 92.393%
AUPR In: 99.018%
AUPR Out: 93.336%
calculate metrics for mis
mis Performance of Baseline detector
TNR at TPR 95%: 66.203%
AUROC: 91.960%
Detection acc: 86.838%
AUPR In: 99.314%
AUPR Out: 52.331%
run5:
Final Accuracy: 24345/25856 (94.16%)
generate log from out-of-distribution data
calculate metrics for OOD
OOD Performance of Baseline detector
TNR at TPR 95%: 90.076%
AUROC: 98.126%
Detection acc: 93.207%
AUPR In: 99.264%
AUPR Out: 94.566%
calculate metrics for mis
mis Performance of Baseline detector
TNR at TPR 95%: 66.815%
AUROC: 92.440%
Detection acc: 86.975%
AUPR In: 99.357%
AUPR Out: 54.796%
Average:
Accuracy: 94.232 ± 0.226
TNR at TPR 95%: 87.761 ± 3.561
AUROC: 97.824 ± 0.505
Detection acc: 92.920 ± 0.738
AUPR In: 99.156 ± 0.190
AUPR Out: 93.908 ± 1.399
As you can see, not every single run falls within the average ± std.
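To make the point about run-to-run variance concrete, here is a small sketch that recomputes the mean and standard deviation from the per-run values logged above and counts how many runs land within one standard deviation of the mean. It assumes the sample standard deviation (n - 1 in the denominator), which reproduces the quoted accuracy figure of 94.232 ± 0.226; the `summarize` helper is a hypothetical name, not part of the repo.

```python
from statistics import mean, stdev

# Per-run values copied from the five run logs in this thread.
runs = {
    "Accuracy": [94.42, 93.88, 94.27, 94.43, 94.16],
    "AUPR Out": [91.916, 93.575, 95.651, 93.336, 94.566],
}


def summarize(values):
    """Return (mean, sample std, number of runs within mean +- one std)."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    inside = sum(1 for v in values if abs(v - m) <= s)
    return m, s, inside


for name, values in runs.items():
    m, s, inside = summarize(values)
    print(f"{name}: {m:.3f} +- {s:.3f} ({inside}/{len(values)} runs within one std)")
```

With only five runs, it is normal for one or two of them to fall outside the one-standard-deviation band, so a single run that misses the reported interval is not by itself evidence of a problem. Note that the logged AUPR Out values average to about 93.81, so the quoted 93.908 may contain a transposed digit or come from a slightly different set of runs.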
Yes looks good now thank you! @Lingkai-Kong
@Lingkai-Kong
I ran the Python commands from the repo and could not reproduce the results quoted in the paper.
Of course it was just one run, but the values seem too low given the standard deviation.