sgvaze / osr_closed_set_all_you_need

Official repo for our ICLR 22 Oral paper: "Open-Set Recognition: a Good Closed-Set Classifier is All You Need?"
MIT License
267 stars 32 forks source link

Open-Set Recognition: a Good Closed-Set Classifier is All You Need?

Code for our paper: Open-Set Recognition: a Good Closed-Set Classifier is All You Need?

We tackle open-set recognition: the task of detecting if a test sample comes from an unseen class (which the model did not see during training). We find a simple baseline of training a regular closed-set classifier as well as possible, and using the 'maximum logit score' (MLS) as an open-set indicator, can achieve SoTA on a number of evaluations. We also propose the Semantic Shift Benchmark for open-set recognition and related tasks.

image

Contents

:boom: 1. Updates

:globe_with_meridians: 2. The Semantic Shift Benchmark

:running: 3. Running

:chart_with_upwards_trend: 4. Hyper-parameters

:clipboard: 5. Citation

:boom: Updates

Updates to paper since pre-print

Repo updates (18/01/2022)

:globe_with_meridians: The Semantic Shift Benchmark

Download instructions for the datasets in the SSB can be found at the links below. The folder data/open_set_splits contains pickle files with the class splits. For each dataset, data contains functions which return PyTorch datasets containing 'seen' and 'unseen' classes according to the SSB splits. For the FGVC datasets, the pickle files also include information on which unseen classes are most similar to which seen classes.

Links for the legacy open-set datasets are also available at the links below:

For TinyImageNet, you also need to run create_val_img_folder in data/tinyimagenet.py to create a directory with the test data.

:running: Running

Dependencies

pip install -r requirements.txt

Config

Set paths to datasets and pre-trained models (for fine-grained experiments) in config.py

Set SAVE_DIR (logfile destination) and PYTHON (path to python interpreter) in bash_scripts scripts.

Scripts

Train models: To train models on all splits on a specified dataset (using tuned hyper-parameters from the paper), run:

bash bash_scripts/osr_train.sh

Evaluating models: Models can be evaluated by editing exp_ids in methods/tests/openset_test.py. The experiment IDs are printed in the Namespace at the top of each log file.

Pre-trained models: Pre-trained weights for the MLS baseline on the five TinyImageNet splits can be found in pretrained_weights/. The models should achieve an average of 84.2% accuracy on the test-sets of the closed-set classes (across the five splits) and an average 83.0% AUROC on the open-set detection task. Models are all VGG32 and use this image normalization at test-time with image_size=64.

:chart_with_upwards_trend: Optimal Hyper-parameters:

We tuned label smoothing and RandAug hyper-parameters to optimise closed-set accuracy on a single random validation split for each dataset. For other hyper-parameters (image size, batch size, learning rate) we took values from the open-set literature for the standard datasets (specifically, the ARPL paper) and values from the FGVC literature for the proposed FGVC benchmarks.

Cross-Entropy optimal hyper-parameters:

Dataset Image Size Learning Rate RandAug N RandAug M Label Smoothing Batch Size
MNIST 32 0.1 1 8 0.0 128
SVHN 32 0.1 1 18 0.0 128
CIFAR-10 32 0.1 1 6 0.0 128
CIFAR + N 32 0.1 1 6 0.0 128
TinyImageNet 64 0.01 1 9 0.9 128
CUB 448 0.001 2 30 0.3 32
FGVC-Aircraft 448 0.001 2 15 0.2 32

ARPL + CS optimal hyper-parameters:

(Note the lower learning rate for TinyImageNet)

Dataset Image Size Learning Rate RandAug N RandAug M Label Smoothing Batch Size
MNIST 32 0.1 1 8 0.0 128
SVHN 32 0.1 1 18 0.0 128
CIFAR10 32 0.1 1 15 0.0 128
CIFAR + N 32 0.1 1 6 0.0 128
TinyImageNet 64 0.001 1 9 0.9 128
CUB 448 0.001 2 30 0.2 32
FGVC-Aircraft 448 0.001 2 18 0.1 32

Other

This repo also contains other useful utilities, including:

  • utils/logfile_parser.py: To directly parse stdout outputs for Accuracy / AUROC metrics
  • data/open_set_datasets.py: A useful framework for easily splitting existing datasets into controllable open-set splits into train, val, test_known and test_unknown. Note: ImageNet has not yet been integrated here.
  • utils/schedulers.py: Implementation of Cosine Warm Restarts with linear rampup as a PyTorch learning rate scheduler

:clipboard: Citation

If you use this code in your research, please consider citing our paper:

@InProceedings{vaze2022openset,
               title={Open-Set Recognition: a Good Closed-Set Classifier is All You Need?},
               author={Sagar Vaze and Kai Han and Andrea Vedaldi and Andrew Zisserman},
               booktitle={International Conference on Learning Representations},
               year={2022}}

Furthermore, please also consider citing Adversarial Reciprocal Points Learning for Open Set Recognition, upon whose code we build this repo.