AlexMeinke / certified-certain-uncertainty

A way to achieve uniform confidence far away from the training data.

Question about Gaussian mixture model #2

Closed e96031413 closed 2 years ago

e96031413 commented 2 years ago

Hi, since 80 Million Tiny Images has been removed by MIT, how can I initialize a Gaussian mixture model?

AlexMeinke commented 2 years ago

Thanks for the question.

You can use the gen_gmm.py script and simply replace the out_loader in line 76 with whatever alternative you would like to use.
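For illustration, a minimal sketch of such a replacement might look like the following, assuming out_loader is an ordinary PyTorch DataLoader over an out-distribution image dataset; CIFAR-100 and the 32x32 resolution here are just placeholders for whatever OOD data and resolution fit your setup:

```python
# Hypothetical replacement for out_loader around line 76 of gen_gmm.py.
# Any PyTorch DataLoader over an out-distribution image dataset should do.
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.Resize((32, 32)),  # assumption: match your in-distribution resolution
    transforms.ToTensor(),
])

# CIFAR-100 as a stand-in OOD set; swap in whatever dataset you prefer.
out_dataset = torchvision.datasets.CIFAR100(
    root='./data', train=True, download=True, transform=transform
)
out_loader = torch.utils.data.DataLoader(
    out_dataset, batch_size=128, shuffle=True, num_workers=4
)
```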

e96031413 commented 2 years ago

Thanks for your reply.

By the way, I have a dataset containing 1.36 million RGB images of shape 224x224, and I would like to make sure my model knows when it doesn't know.

However, when I tried to initialize the Gaussian mixture model with the following command:

```
python gen_gmm.py --dataset MyOwnDatasetName --PCA 1 --augm_flag 1
```

the terminal printed the following error:

```
RuntimeError: DataLoader worker (pid XXXXX) is killed by signal: Killed
```

I think this error is caused by the size of my dataset (both the number of images and the image resolution), so my RAM is not big enough to hold the entire dataset.

Do you think I can still adopt a Gaussian mixture model in my setting? Or should I just train my model with PGD attacks to improve its robustness?

AlexMeinke commented 2 years ago

Using this many higher-resolution images is certainly infeasible with the current implementation of the GMM fitting (which uses scikit-learn). In principle, you could get around this either by reducing the number of samples on which you fit the GMM (using the data_used flag for in-distribution data and adjusting the number of batches in line 81 for OOD data), or by fitting the GMM with a more memory-efficient algorithm, i.e. one that doesn't load all the data into RAM first. Either way, once the GMM is initialized, the scale of the data should no longer be an issue.
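If you go the subsampling route, a minimal sketch could look like the following; feature_extractor, loader, and the sample cap are placeholders for your own pipeline, not names from gen_gmm.py:

```python
# Sketch: fit the GMM on a random subsample instead of the full dataset,
# so at most max_samples feature vectors ever sit in RAM at once.
import numpy as np
import torch
from sklearn.mixture import GaussianMixture

max_samples = 100_000  # assumption: cap on how many vectors the GMM sees
collected = []
n_seen = 0

with torch.no_grad():
    for x, _ in loader:  # placeholder DataLoader over your (shuffled) dataset
        feats = feature_extractor(x)  # placeholder, e.g. penultimate-layer features
        collected.append(feats.cpu().numpy())
        n_seen += feats.shape[0]
        if n_seen >= max_samples:
            break

X = np.concatenate(collected, axis=0)[:max_samples]

# Fitting on the subsample keeps scikit-learn's in-RAM requirement manageable;
# n_components and covariance_type are illustrative choices, not fixed values.
gmm = GaussianMixture(n_components=100, covariance_type='diag')
gmm.fit(X)
```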

When you say that you would like to improve robustness, which notion are you referring to exactly? Would you like robustness on the classification task, or rather low confidence on OOD samples that remains low even after adversarial manipulation? In the former case CCU will not help you anyway. In the latter case it will, but potentially only on far OOD samples. If I can shamelessly plug our more recent approach to solving overconfidence on adversarial out-distribution samples, you may want to check out Provably Robust Detection of Out-of-distribution Data (almost) for free, which is much cheaper to train for robust OOD detection than PGD-based adversarial training and allows for more reliable evaluation by issuing robustness certificates (which adversarial training cannot). It is also already known to work at 224x224 resolution.

e96031413 commented 2 years ago

Thanks for the detailed explanation.

I would like to go with "low confidence on OOD samples that retain low confidence even after adversarial manipulation".

I will also try "Provably Robust Detection of Out-of-distribution Data (almost) for free", since my task requires fast training and inference.

Thank you for the great work.