
y0ast / deterministic-uncertainty-quantification

Code for "Uncertainty Estimation Using a Single Deep Deterministic Neural Network"

https://arxiv.org/abs/2003.02037

MIT License

268 stars, 31 forks

Reproduce results #3

Closed. ai4prod closed this issue 3 years ago.

ai4prod commented 3 years ago

Hi, thanks for sharing your work.

I'm trying to reproduce your results, specifically the CIFAR10 vs. SVHN results. I have trained your model and am now testing it.

I produced 2 histograms of the scores (kernel_distance) for CIFAR10 and SVHN, but they are quite different from the ones in your paper. The accuracy and AUROC are similar to the paper:

SVHN Accuracy, AUROC: 0.9135159073448065 0.9238

obtained from

accuracy, auroc = get_cifar_svhn_ood(model)

CIFAR10

0.9238 0.9070916430423466

obtained from the function

accuracy, auroc = get_auroc_classification(test_dataset, model)

I have attached the 2 histograms:

[Figure: histogram of scores, cifar10]

[Figure: histogram of scores, svhn]

Do you think these results are similar to yours?

Thanks

y0ast commented 3 years ago

Did you run the code in this repository and obtain those accuracies?

ai4prod commented 3 years ago

Hi @y0ast

Yes, I ran the code in this repository. I ran train_duq_cifar.py with the default parameters and then wrote an external script to load the trained model. The script is the following:

model = ResNet_DUQ(32, 10, 512, 512, 0.1, 0.999)
model.load_state_dict(torch.load("saved_models/DUQ_0.1__0.5_0.999_512_75.pt"))
model.eval()
model.cuda()

accuracy, auroc = get_cifar_svhn_ood(model)
print(auroc, accuracy)

# CIFAR

ds = all_datasets["CIFAR10"]()
input_size, num_classes, dataset, test_dataset = ds
accuracy, auroc = get_auroc_classification(test_dataset, model)

print(accuracy, auroc)

and inside the loop_over_dataloader function I added


    plt.hist(scores, bins=[-1, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1])
    plt.title("svhn")
    plt.show()

to plot the histograms above.
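Editorial note: the bin edges in the snippet above stop at -0.1, so any score above -0.1 would not be counted. The following is a minimal sketch of counting scores over a full range, assuming (this is not confirmed in the thread) that the scores lie in [-1, 0]. The function name histogram_counts and its parameters are hypothetical and are not part of the repository.

```python
def histogram_counts(scores, low=-1.0, high=0.0, n_bins=10):
    """Count scores in n_bins equal-width bins covering [low, high].

    Assumption: scores lie in [low, high]; anything outside is skipped.
    The last bin is closed on the right so that a score equal to `high`
    is still counted.
    """
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for s in scores:
        if s < low or s > high:
            continue  # outside the plotted range
        idx = min(int((s - low) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Toy scores (made up for illustration, not taken from the thread).
toy_scores = [-0.95, -0.95, -0.45, -0.05, 0.0]
toy_counts = histogram_counts(toy_scores)
```

With these toy values, the last two scores land in the final bin, which the bin edges quoted in the post would leave out. Using the same bin edges for both datasets also makes the two histograms directly comparable.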

Do you have any ideas on how to debug this further?

Thanks

y0ast commented 3 years ago

I'm mostly surprised by the difference in accuracy in your first results. These should be exactly the same. Also the accuracy is lower than expected. I'll make a note to double check my implementation after the ICML deadline. I'll also add a pre-trained model to the repo then.

y0ast commented 3 years ago

I looked into this, and found that the code is functioning as expected.

To obtain 94%+ accuracy, use --final_model as suggested in the readme.

To obtain the CIFAR10 vs SVHN OoD AUROC, use get_auroc_ood.

The function get_auroc_classification computes the area under the ROC curve of "rejection classification" (see paper for more details) and can be used for model selection (it does not depend on OoD data).
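Editorial note: the difference between the two AUROC values discussed above can be illustrated with a small, self-contained sketch. It is not the repository's implementation. It assumes that each sample has an uncertainty score (higher means less certain), that the rejection-classification AUROC asks how well that score separates misclassified from correctly classified in-distribution samples, and that the OoD AUROC asks how well it separates OoD from in-distribution samples. All function names below are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive (label 1) scores
    higher than a random negative (label 0); ties count as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rejection_classification_auroc(correct, uncertainty):
    """Uses in-distribution data only: positives are misclassified samples."""
    labels = [0 if c else 1 for c in correct]
    return auroc(labels, uncertainty)


def ood_auroc(in_dist_uncertainty, ood_uncertainty):
    """Uses both datasets: positives are the OoD samples."""
    labels = [0] * len(in_dist_uncertainty) + [1] * len(ood_uncertainty)
    return auroc(labels, list(in_dist_uncertainty) + list(ood_uncertainty))


# Toy values (made up for illustration, not results from the thread).
correct = [True, True, True, False]
in_dist_unc = [0.1, 0.2, 0.6, 0.5]
ood_unc = [0.7, 0.9]

rejection_auroc = rejection_classification_auroc(correct, in_dist_unc)
ood_detection_auroc = ood_auroc(in_dist_unc, ood_unc)
```

The two numbers answer different questions, so they need not agree: the first never looks at OoD data, which is why it can serve for model selection, while the second requires an OoD dataset such as SVHN.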