Code for our paper: Open-Set Recognition: a Good Closed-Set Classifier is All You Need?
We tackle open-set recognition: the task of detecting whether a test sample comes from an unseen class (one the model did not see during training). We find that a simple baseline, training a regular closed-set classifier as well as possible and using the 'maximum logit score' (MLS) as an open-set indicator, can achieve SoTA on a number of evaluations. We also propose the Semantic Shift Benchmark (SSB) for open-set recognition and related tasks.
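As a rough illustration of the MLS idea, the sketch below scores each sample by its largest logit and measures how well that score separates seen-class from unseen-class samples via AUROC. It is a minimal pure-Python sketch, not the repo's evaluation code: the function names `max_logit_score` and `auroc` and the toy logits are assumptions made for this example.

```python
from typing import List, Sequence


def max_logit_score(logits: Sequence[float]) -> float:
    """Return the largest logit; a higher value suggests a 'seen' class."""
    return max(logits)


def auroc(known_scores: Sequence[float], unknown_scores: Sequence[float]) -> float:
    """Probability that a random known sample outscores a random unknown one.

    Ties count as half a win, which matches the usual AUROC definition.
    """
    wins = 0.0
    for k in known_scores:
        for u in unknown_scores:
            if k > u:
                wins += 1.0
            elif k == u:
                wins += 0.5
    return wins / (len(known_scores) * len(unknown_scores))


# Toy logits (hypothetical): rows are samples, columns are seen classes.
known_logits: List[List[float]] = [[4.1, 0.3, -1.2], [0.2, 3.5, 0.1], [1.0, 0.8, 2.9]]
unknown_logits: List[List[float]] = [[0.9, 1.1, 0.7], [3.0, 0.4, 0.2], [0.1, -0.3, 0.5]]

known_scores = [max_logit_score(row) for row in known_logits]
unknown_scores = [max_logit_score(row) for row in unknown_logits]
open_set_auroc = auroc(known_scores, unknown_scores)
print(f"Open-set AUROC on toy data: {open_set_auroc:.3f}")
```

The pairwise AUROC is quadratic in the number of samples, which is fine for a toy example; a library routine would be preferable for real test sets.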
Pre-trained weights are included in `pretrained_weights`.
`bash_scripts/osr_train.sh` is included to train models on all splits for a given dataset using the tuned hyper-parameters from the paper.
Download instructions for the datasets in the SSB can be found at the links below. The folder `data/open_set_splits` contains pickle files with the class splits. For each dataset, `data` contains functions which return PyTorch datasets containing 'seen' and 'unseen' classes according to the SSB splits. For the FGVC datasets, the pickle files also include information on which unseen classes are most similar to which seen classes.
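To make the 'seen' / 'unseen' partition concrete, here is a minimal sketch of splitting labelled samples by class. It does not reproduce the repo's dataset functions or the layout of the pickle files; the helper `split_by_class`, the sample tuples and the class names are all hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

# A sample is (identifier, class label); real code would hold images instead.
Sample = Tuple[str, Hashable]


def split_by_class(
    samples: Iterable[Sample], seen_classes: Iterable[Hashable]
) -> Dict[str, List[Sample]]:
    """Partition samples into those from seen classes and those from unseen classes."""
    seen_set = set(seen_classes)
    result: Dict[str, List[Sample]] = {"seen": [], "unseen": []}
    for sample in samples:
        key = "seen" if sample[1] in seen_set else "unseen"
        result[key].append(sample)
    return result


# Hypothetical toy data: four classes, two of which are treated as seen.
toy_samples: List[Sample] = [
    ("img_0", "class_a"),
    ("img_1", "class_b"),
    ("img_2", "class_c"),
    ("img_3", "class_a"),
    ("img_4", "class_d"),
]
split = split_by_class(toy_samples, seen_classes=["class_a", "class_b"])
print(len(split["seen"]), "seen samples,", len(split["unseen"]), "unseen samples")
```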
The legacy open-set datasets are also available at the links below:
For TinyImageNet, you also need to run `create_val_img_folder` in `data/tinyimagenet.py` to create a directory with the test data.
pip install -r requirements.txt
Set paths to datasets and pre-trained models (for fine-grained experiments) in `config.py`.
Set `SAVE_DIR` (logfile destination) and `PYTHON` (path to python interpreter) in the `bash_scripts` scripts.
Train models: To train models on all splits on a specified dataset (using tuned hyper-parameters from the paper), run:
bash bash_scripts/osr_train.sh
Evaluating models: Models can be evaluated by editing `exp_ids` in `methods/tests/openset_test.py`. The experiment IDs are printed in the `Namespace` at the top of each log file.
Pre-trained models: Pre-trained weights for the MLS baseline on the five TinyImageNet splits can be found in `pretrained_weights/`. The models should achieve an average of 84.2% accuracy on the test-sets of the closed-set classes (across the five splits) and an average 83.0% AUROC on the open-set detection task. Models are all VGG32 and use this image normalization at test-time with `image_size=64`.
We tuned label smoothing and RandAug hyper-parameters to optimise closed-set accuracy on a single random validation split for each dataset. For other hyper-parameters (image size, batch size, learning rate) we took values from the open-set literature for the standard datasets (specifically, the ARPL paper) and values from the FGVC literature for the proposed FGVC benchmarks.
Cross-Entropy optimal hyper-parameters:
Dataset | Image Size | Learning Rate | RandAug N | RandAug M | Label Smoothing | Batch Size |
---|---|---|---|---|---|---|
MNIST | 32 | 0.1 | 1 | 8 | 0.0 | 128 |
SVHN | 32 | 0.1 | 1 | 18 | 0.0 | 128 |
CIFAR-10 | 32 | 0.1 | 1 | 6 | 0.0 | 128 |
CIFAR + N | 32 | 0.1 | 1 | 6 | 0.0 | 128 |
TinyImageNet | 64 | 0.01 | 1 | 9 | 0.9 | 128 |
CUB | 448 | 0.001 | 2 | 30 | 0.3 | 32 |
FGVC-Aircraft | 448 | 0.001 | 2 | 15 | 0.2 | 32 |
ARPL + CS optimal hyper-parameters:
(Note the lower learning rate for TinyImageNet)
Dataset | Image Size | Learning Rate | RandAug N | RandAug M | Label Smoothing | Batch Size |
---|---|---|---|---|---|---|
MNIST | 32 | 0.1 | 1 | 8 | 0.0 | 128 |
SVHN | 32 | 0.1 | 1 | 18 | 0.0 | 128 |
CIFAR-10 | 32 | 0.1 | 1 | 15 | 0.0 | 128 |
CIFAR + N | 32 | 0.1 | 1 | 6 | 0.0 | 128 |
TinyImageNet | 64 | 0.001 | 1 | 9 | 0.9 | 128 |
CUB | 448 | 0.001 | 2 | 30 | 0.2 | 32 |
FGVC-Aircraft | 448 | 0.001 | 2 | 18 | 0.1 | 32 |
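For scripting, the two tables above can be transcribed into a small lookup structure. The sketch below is only one possible layout: the names `HParams`, `CROSS_ENTROPY`, `ARPL_CS` and `get_hparams` are assumptions made here and are not part of the repo; the values are copied from the tables.

```python
from typing import Dict, NamedTuple


class HParams(NamedTuple):
    image_size: int
    lr: float
    rand_aug_n: int
    rand_aug_m: int
    label_smoothing: float
    batch_size: int


# Cross-Entropy optimal hyper-parameters (first table).
CROSS_ENTROPY: Dict[str, HParams] = {
    "MNIST": HParams(32, 0.1, 1, 8, 0.0, 128),
    "SVHN": HParams(32, 0.1, 1, 18, 0.0, 128),
    "CIFAR-10": HParams(32, 0.1, 1, 6, 0.0, 128),
    "CIFAR + N": HParams(32, 0.1, 1, 6, 0.0, 128),
    "TinyImageNet": HParams(64, 0.01, 1, 9, 0.9, 128),
    "CUB": HParams(448, 0.001, 2, 30, 0.3, 32),
    "FGVC-Aircraft": HParams(448, 0.001, 2, 15, 0.2, 32),
}

# ARPL + CS optimal hyper-parameters (second table).
ARPL_CS: Dict[str, HParams] = {
    "MNIST": HParams(32, 0.1, 1, 8, 0.0, 128),
    "SVHN": HParams(32, 0.1, 1, 18, 0.0, 128),
    "CIFAR-10": HParams(32, 0.1, 1, 15, 0.0, 128),
    "CIFAR + N": HParams(32, 0.1, 1, 6, 0.0, 128),
    "TinyImageNet": HParams(64, 0.001, 1, 9, 0.9, 128),
    "CUB": HParams(448, 0.001, 2, 30, 0.2, 32),
    "FGVC-Aircraft": HParams(448, 0.001, 2, 18, 0.1, 32),
}

TABLES = {"cross_entropy": CROSS_ENTROPY, "arpl_cs": ARPL_CS}


def get_hparams(dataset: str, loss: str = "cross_entropy") -> HParams:
    """Look up the tuned hyper-parameters for a dataset and loss."""
    return TABLES[loss][dataset]


# TinyImageNet uses a lower learning rate with ARPL + CS than with cross-entropy.
print(get_hparams("TinyImageNet").lr, get_hparams("TinyImageNet", "arpl_cs").lr)
```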
This repo also contains other useful utilities, including:
`utils/logfile_parser.py`: To directly parse `stdout` outputs for Accuracy / AUROC metrics.
`data/open_set_datasets.py`: A useful framework for easily splitting existing datasets into controllable open-set splits: `train`, `val`, `test_known` and `test_unknown`. Note: ImageNet has not yet been integrated here.
`utils/schedulers.py`: Implementation of Cosine Warm Restarts with linear rampup as a PyTorch learning rate scheduler.
If you use this code in your research, please consider citing our paper:
@InProceedings{vaze2022openset,
title={Open-Set Recognition: a Good Closed-Set Classifier is All You Need?},
author={Sagar Vaze and Kai Han and Andrea Vedaldi and Andrew Zisserman},
booktitle={International Conference on Learning Representations},
year={2022}}
Furthermore, please also consider citing Adversarial Reciprocal Points Learning for Open Set Recognition, upon whose code we build this repo.
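For readers curious about the cosine warm restarts schedule with linear rampup mentioned among the utilities, the sketch below shows one common way to compute such a learning rate per epoch. It is a hedged illustration, not the code in `utils/schedulers.py`: the function `lr_at_epoch`, its parameters and the example values (5 warm-up epochs, restart period of 10) are assumptions.

```python
import math
from typing import List


def lr_at_epoch(
    epoch: int,
    base_lr: float,
    warmup_epochs: int,
    restart_period: int,
    min_lr: float = 0.0,
) -> float:
    """Learning rate for a given epoch (0-indexed).

    Linear ramp-up to base_lr over `warmup_epochs`, then cosine decay from
    base_lr towards min_lr that restarts every `restart_period` epochs.
    """
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    # Position inside the current cosine cycle.
    t = (epoch - warmup_epochs) % restart_period
    cosine = 0.5 * (1.0 + math.cos(math.pi * t / restart_period))
    return min_lr + (base_lr - min_lr) * cosine


# Example schedule with assumed settings.
schedule: List[float] = [
    lr_at_epoch(e, base_lr=0.1, warmup_epochs=5, restart_period=10) for e in range(25)
]
for epoch in (0, 4, 5, 10, 14, 15):
    print(epoch, round(schedule[epoch], 5))
```

In PyTorch, a function like this, divided by `base_lr` so that it returns a multiplicative factor, could be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`.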