Closed AlanPeng0897 closed 1 year ago
Thanks for the comments! The ID buffer always stores the same ID points, but for the OOD buffer there is a tradeoff in deciding how much data to keep once the buffer is full. The learned decision boundary may not remain stable if the data points in the buffer change frequently. However, when the batch size is not large, it is reasonable to gradually replace the oldest data in the buffer by adjusting start_idx every iteration. We will give it a try and check the performance. Please stay tuned!
We ran POEM with start_idx updated every iteration once the buffer is full, and the results look similar to the original version. The codebase is updated :) and here are some new results.
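To make the replacement scheme concrete, here is a minimal sketch of a fixed-capacity buffer that overwrites its oldest rows once it is full. Only the name start_idx comes from this thread; the class `OODBuffer`, its `add` method, and the NumPy storage layout are assumptions for illustration and are not taken from neural_linear_opt.py.

```python
import numpy as np


class OODBuffer:
    """Hypothetical fixed-capacity buffer; not the POEM implementation."""

    def __init__(self, capacity, dim):
        self.capacity = capacity
        self.data = np.zeros((capacity, dim), dtype=np.float32)
        self.size = 0       # number of valid rows currently stored
        self.start_idx = 0  # position of the oldest row once the buffer is full

    def add(self, batch):
        """Append a batch; once the buffer is full, overwrite the oldest rows."""
        for row in np.asarray(batch, dtype=np.float32):
            if self.size < self.capacity:
                # Still filling up: write to the next free slot.
                self.data[self.size] = row
                self.size += 1
            else:
                # Full: replace the oldest row and advance start_idx (wrapping around).
                self.data[self.start_idx] = row
                self.start_idx = (self.start_idx + 1) % self.capacity


buf = OODBuffer(capacity=4, dim=1)
buf.add([[0], [1], [2]])
buf.add([[3], [4], [5]])  # the last two rows overwrite the two oldest entries
```

With a small batch relative to the capacity, each call replaces only a small fraction of the buffer, which is the gradual turnover discussed above.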
CIFAR-10
We ran the train_poem.py script with default hyperparameters (e.g., sigma_n 20, sigma 0.5) for 100 epochs. Places365 here denotes a random subset of 10,000 examples.
| OOD dataset | FPR95 | AUROC | AUPR |
| --- | --- | --- | --- |
| Places365 | 2.37 | 99.23 | 99.33 |
| LSUN | 15.17 | 97.03 | 97.54 |
| LSUN_resize | 0.00 | 100.00 | 100.00 |
| iSUN | 0.00 | 100.00 | 100.00 |
| Textures | 0.20 | 99.87 | 99.93 |
| SVHN | 0.82 | 99.35 | 99.52 |
| Avg | 3.09 | 99.3 | 99.4 |
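For readers less familiar with these metrics, the following is a rough sketch of how FPR95 and AUROC are commonly computed from detection scores. It assumes that a higher score means "more in-distribution"; the function names are hypothetical and this is not the evaluation code from the POEM repository.

```python
import numpy as np


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted when about 95% of ID samples are accepted."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Threshold at the 5th percentile of ID scores, so roughly 95% of ID lies above it.
    threshold = np.percentile(id_scores, 5)
    return float(np.mean(ood_scores >= threshold))


def auroc(id_scores, ood_scores):
    """Probability that a random ID sample scores above a random OOD sample (ties count half)."""
    id_scores = np.asarray(id_scores, dtype=float)[:, None]
    ood_scores = np.asarray(ood_scores, dtype=float)[None, :]
    wins = np.mean(id_scores > ood_scores)
    ties = np.mean(id_scores == ood_scores)
    return float(wins + 0.5 * ties)


id_scores = np.arange(100)             # toy ID scores 0..99
separated_ood = np.full(10, -1.0)      # toy OOD scores below every ID score
print(fpr_at_95_tpr(id_scores, separated_ood), auroc(id_scores, separated_ood))
```

The pairwise AUROC above is quadratic in the number of samples, so it is only meant for small illustrative inputs.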
CIFAR-100
We trained with default hyperparameters (e.g., sigma_n 20, sigma 0.5) for 200 epochs. We found that performance is relatively stable (not sensitive to the posterior update hyperparameters) when training longer with random replacement of OOD samples at every iteration. Places365 here denotes a random subset of 10,000 examples.
Each cell lists FPR95 / AUROC / AUPR, all in percent.

| sigma_n, sigma | Places365 | LSUN | LSUN_resize | iSUN | Textures (dtd) | SVHN | Avg |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 20, 0.5 | 10.44 / 97.74 / 97.88 | 49.75 / 92.56 / 94.13 | 0.00 / 100.00 / 100.00 | 0.00 / 99.99 / 99.99 | 3.00 / 99.05 / 99.41 | 8.68 / 98.21 / 98.30 | 11.98 / 97.93 / 98.28 |
| 20, 0.1 | 10.64 / 97.76 / 97.98 | 41.88 / 93.56 / 94.71 | 0.00 / 100.00 / 100.00 | 0.00 / 100.00 / 100.00 | 3.23 / 98.98 / 99.39 | 8.30 / 98.35 / 98.55 | 10.67 / 98.11 / 98.44 |
| 5, 0.5 | 12.80 / 97.23 / 97.38 | 53.14 / 91.48 / 93.39 | 0.00 / 100.00 / 100.00 | 0.00 / 100.00 / 100.00 | 4.01 / 98.66 / 99.02 | 9.43 / 97.97 / 98.09 | 13.23 / 97.56 / 97.98 |
| 5, 0.1 | 10.94 / 97.56 / 97.71 | 50.62 / 91.55 / 93.18 | 0.00 / 100.00 / 100.00 | 0.00 / 100.00 / 100.00 | 3.60 / 98.86 / 99.36 | 10.42 / 97.87 / 97.98 | 12.6 / 97.64 / 98.04 |
| 10, 0.5 | 11.04 / 97.61 / 97.73 | 63.54 / 90.58 / 92.99 | 0.00 / 100.00 / 100.00 | 0.00 / 99.99 / 99.99 | 3.33 / 98.91 / 99.40 | 8.29 / 98.30 / 98.51 | 14.37 / 97.57 / 98.1 |
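As a quick sanity check on the Avg entries, the sketch below recomputes the unweighted mean over the six OOD datasets for the sigma_n 20, sigma 0.5 run. The per-dataset values are copied from the results above in the order Places365, LSUN, LSUN_resize, iSUN, Textures, SVHN; the variable and helper names are only illustrative.

```python
# Per-dataset values for sigma_n 20, sigma 0.5 on CIFAR-100, in percent.
fpr95 = [10.44, 49.75, 0.00, 0.00, 3.00, 8.68]
auroc = [97.74, 92.56, 100.00, 99.99, 99.05, 98.21]
aupr = [97.88, 94.13, 100.00, 99.99, 99.41, 98.30]


def mean(values):
    """Unweighted mean across datasets."""
    return sum(values) / len(values)


avg_fpr95 = mean(fpr95)
avg_auroc = mean(auroc)
avg_aupr = mean(aupr)

# Print on the percentage scale; dividing by 100 gives the fractional form.
print(f"Avg FPR95: {avg_fpr95:.3f}")
print(f"Avg AUROC: {avg_auroc:.3f} ({avg_auroc / 100:.5f} as a fraction)")
print(f"Avg AUPR:  {avg_aupr:.3f} ({avg_aupr / 100:.5f} as a fraction)")
```

The recomputed means agree with the reported averages for this run (11.98, 97.93, and 98.28 on the percentage scale) up to rounding in the last digit.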
Should "start_idx" be updated? (https://github.com/deeplearning-wisc/poem/blob/main/neural_linear_opt.py#L94-L95)