TL;DR: An ImageNet replacement dataset for self-supervised pretraining without humans
PASS is a large-scale image dataset that does not include any humans, human parts, or other personally identifiable information. It can be used for high-quality pretraining while significantly reducing privacy concerns.
The quickest way:
```sh
git clone https://github.com/yukimasano/PASS
cd PASS
source download.sh  # optionally change the download directory inside the script first
```
Generally: all information is on our webpage.
To download the dataset, please visit our dataset on Zenodo. There you can download it as tar files and find the metadata.
You can also download the images from their AWS URLs, from here.
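As a minimal sketch of downloading from a URL list (the metadata filename and the `url` column name below are assumptions, not the actual Zenodo schema -- check the metadata files for the real layout):

```python
import csv
import pathlib
import urllib.request

META_FILE = "pass_metadata.csv"          # hypothetical metadata file with one image URL per row
OUT_DIR = pathlib.Path("PASS_images")
OUT_DIR.mkdir(exist_ok=True)

with open(META_FILE, newline="") as f:
    for row in csv.DictReader(f):
        url = row["url"]                 # assumed column name
        target = OUT_DIR / url.split("/")[-1]
        if not target.exists():          # skip files that were already downloaded
            urllib.request.urlretrieve(url, str(target))
```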
| Pretraining | Method | Epochs | IN-1k Acc. (%) | Places205 Acc. (%) | Weights |
|---|---|---|---|---|---|
| (IN-1k) | MoCo-v2 | 200 | 60.6 | 50.1 | visit MoCo-v2 repo |
| PASS | MoCo-v2 | 180 | 59.1 | 52.8 | R50 weights |
| PASS | MoCo-v2 | 200 | 59.5 | 52.8 | R50 weights |
| PASS | MoCo-v2 | 800 | 61.2 | 54.0 | R50 weights |
| PASS | MoCo-v2 (R18) | 800 | 45.3 | 44.4 | R18 weights |
| PASS | MoCo-v2-CLD | 200 | 60.2 | 53.1 | R50 weights |
| PASS | SwAV | 200 | 60.8 | 55.5 | R50 weights |
| PASS | DINO | 100 | 61.3 | 54.6 | ViT S16 weights |
| PASS | DINO | 300 | 65.0 | 55.7 | ViT S16 weights |
In the table above we give the download links to the full checkpoints (including momentum encoder etc.) for the models we have trained. For comparison, we include MoCo-v2 trained on ILSVRC-12 ("IN-1k") and report linear-probing accuracy on IN-1k and Places205.
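These full checkpoints contain more than the backbone (momentum encoder, projection head, etc.). If you only need the backbone weights, e.g. for your own linear probing, a minimal sketch is shown below; it assumes the MoCo-v2 checkpoints follow the original MoCo repo's layout (a `state_dict` entry with `module.encoder_q.*` keys), and the filename is hypothetical:

```python
import torch
import torchvision

# Assumed MoCo-v2 checkpoint convention: ckpt['state_dict'] holds keys such as
# 'module.encoder_q.conv1.weight'.
ckpt = torch.load("pass_mocov2_r50.pth", map_location="cpu")  # hypothetical filename
state_dict = ckpt["state_dict"]

# Keep only the query-encoder backbone, dropping the projection head (fc.*).
backbone_sd = {}
for k, v in state_dict.items():
    if k.startswith("module.encoder_q.") and not k.startswith("module.encoder_q.fc"):
        backbone_sd[k.replace("module.encoder_q.", "")] = v

model = torchvision.models.resnet50()
msg = model.load_state_dict(backbone_sd, strict=False)  # fc.* stays randomly initialised
print(msg.missing_keys)  # expected: ['fc.weight', 'fc.bias']
```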
The models can also be loaded directly via torch.hub:

```python
import torch

# DINO ViT-S/16 models pretrained on PASS
vits16_100ep = torch.hub.load('yukimasano/PASS:main', 'dino_100ep_vits16')
vits16 = torch.hub.load('yukimasano/PASS:main', 'dino_vits16')

# ResNet-50 models pretrained on PASS with SwAV, MoCo-v2 (800 ep.) and MoCo-v2-CLD (200 ep.)
r50_swav_200ep = torch.hub.load('yukimasano/PASS:main', 'swav_resnet50')
r50_moco_800ep = torch.hub.load('yukimasano/PASS:main', 'moco_resnet50')
r50_moco_cld_200ep = torch.hub.load('yukimasano/PASS:main', 'moco_cld_resnet50')
```
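Once loaded, the models can be used as ordinary feature extractors. A minimal sketch follows; the ImageNet-style preprocessing and the image filename are assumptions, so check the repo's evaluation code for the exact transforms used per model:

```python
from PIL import Image
import torch
from torchvision import transforms

# Standard ImageNet-style preprocessing (an assumption, not necessarily what each model was evaluated with).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

model = torch.hub.load('yukimasano/PASS:main', 'dino_vits16')
model.eval()

img = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)  # any local image
with torch.no_grad():
    feats = model(img)  # embedding vector, e.g. the ViT's [CLS] token output
print(feats.shape)
```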
In the folder PASSify of this repo, you can find automated scripts that try to remove humans from image datasets.
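The scripts in that folder define the actual pipeline; purely as an illustrative sketch of the general idea (not the repo's exact method), images containing people can be flagged with an off-the-shelf detector such as torchvision's COCO-trained Faster R-CNN, where label 1 corresponds to "person":

```python
import torch
import torchvision
from PIL import Image
from torchvision.transforms.functional import to_tensor

# Off-the-shelf COCO-trained detector; class label 1 is 'person'.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
detector.eval()

def contains_person(path, score_threshold=0.5):
    """Return True if a person is detected in the image at `path`."""
    img = to_tensor(Image.open(path).convert("RGB"))
    with torch.no_grad():
        out = detector([img])[0]
    keep = out["scores"] >= score_threshold
    return bool((out["labels"][keep] == 1).any())

# Example: keep only human-free images from a list of file paths.
# clean = [p for p in image_paths if not contains_person(p)]
```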
Please let us know if you have a model pretrained on this dataset and we will add it to the list above.
```bibtex
@Article{asano21pass,
  author  = "Yuki M. Asano and Christian Rupprecht and Andrew Zisserman and Andrea Vedaldi",
  title   = "PASS: An ImageNet replacement for self-supervised pretraining without humans",
  journal = "NeurIPS Track on Datasets and Benchmarks",
  year    = "2021"
}
```