VICRegL: Self-Supervised Learning of Local Visual Features

This repository provides a PyTorch implementation and pretrained models for VICRegL, a self-supervised pretraining method for learning global and local features, described in the paper VICRegL: Self-Supervised Learning of Local Visual Features, published at NeurIPS 2022.

Adrien Bardes, Jean Ponce and Yann LeCun
Meta AI, Inria


Pre-trained Models

You can choose to download only the weights of the pretrained backbone used for downstream tasks, or the full checkpoint which contains backbone and expander/projector weights. All the models are pretrained on ImageNet-1k, except the ConvNeXt-XL model which is pretrained on ImageNet-22k. linear cls. is the linear classification accuracy on the validation set of ImageNet, and linear seg. is the linear frozen mIoU on the validation set of Pascal VOC.

| arch | alpha | params | linear cls. (%) | linear seg. (mIoU) | download |
|---|---|---|---|---|---|
| ResNet-50 | 0.9 | 23M | 71.2 | 54.0 | backbone, full ckpt, logs |
| ResNet-50 | 0.75 | 23M | 70.4 | 55.9 | backbone, full ckpt, logs |
| ConvNeXt-S | 0.9 | 50M | 75.9 | 66.7 | backbone, full ckpt, logs |
| ConvNeXt-S | 0.75 | 50M | 74.6 | 67.5 | backbone, full ckpt, logs |
| ConvNeXt-B | 0.9 | 85M | 77.1 | 69.3 | backbone, full ckpt, logs |
| ConvNeXt-B | 0.75 | 85M | 76.3 | 70.4 | backbone, full ckpt, logs |
| ConvNeXt-XL | 0.75 | 350M | 79.4 | 78.7 | backbone, full ckpt, logs |

Pretrained models on PyTorch Hub

import torch

# Each line below loads a different pretrained model; keep only the one you need.
model = torch.hub.load('facebookresearch/vicregl:main', 'resnet50_alpha0p9')
model = torch.hub.load('facebookresearch/vicregl:main', 'resnet50_alpha0p75')
model = torch.hub.load('facebookresearch/vicregl:main', 'convnext_small_alpha0p9')
model = torch.hub.load('facebookresearch/vicregl:main', 'convnext_small_alpha0p75')
model = torch.hub.load('facebookresearch/vicregl:main', 'convnext_base_alpha0p9')
model = torch.hub.load('facebookresearch/vicregl:main', 'convnext_base_alpha0p75')
model = torch.hub.load('facebookresearch/vicregl:main', 'convnext_xlarge_alpha0p75')
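The entry-point names above appear to follow an `<arch>_alpha<value>` pattern, with the decimal point written as `p`. Below is a minimal sketch of a helper that builds such a name from an architecture and an alpha value. `hub_entry_name` and `KNOWN_ENTRY_POINTS` are hypothetical names introduced here for illustration, not part of the repository, and the helper only accepts the seven names listed above.

```python
# Hypothetical helper, not part of the VICRegL repository.
# It reproduces the naming pattern of the entry points listed in this README.

KNOWN_ENTRY_POINTS = {
    "resnet50_alpha0p9",
    "resnet50_alpha0p75",
    "convnext_small_alpha0p9",
    "convnext_small_alpha0p75",
    "convnext_base_alpha0p9",
    "convnext_base_alpha0p75",
    "convnext_xlarge_alpha0p75",
}


def hub_entry_name(arch: str, alpha: float) -> str:
    """Return the hub entry-point name for an (arch, alpha) pair."""
    # Format alpha compactly (0.9 -> "0.9") and replace "." with "p".
    name = f"{arch}_alpha{alpha:g}".replace(".", "p")
    if name not in KNOWN_ENTRY_POINTS:
        raise ValueError(f"no pretrained model listed for {arch!r} with alpha={alpha}")
    return name


# Usage (needs torch and network access, so it is left commented out):
# import torch
# model = torch.hub.load('facebookresearch/vicregl:main',
#                        hub_entry_name('convnext_base', 0.75))
```

Checking the name against the listed set gives an early, readable error for combinations that were not released, such as ConvNeXt-XL with alpha 0.9.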

Training

Install PyTorch (pytorch.org) and download ImageNet. The code was developed with PyTorch 1.8.1 and torchvision 0.9.1, but should also work with other versions. Set the ImageNet path in the file datasets.py:

IMAGENET_PATH = "path/to/imagenet"

ImageNet can also be loaded from NumPy files by passing the --dataset_from_numpy flag and setting the path:

IMAGENET_NUMPY_PATH = "path/to/imagenet/numpy/files"

The --alpha argument controls the relative weight of the global and local losses. It defaults to 0.75.
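As a rough illustration of what such a weight does, the sketch below combines two scalar loss values as a convex combination. It assumes that alpha multiplies the global term and 1 - alpha the local term. That reading is consistent with the results table (alpha = 0.9 gives better classification, alpha = 0.75 better segmentation), but this README does not state it explicitly. `combine_losses` is a hypothetical name, not a function from the repository.

```python
def combine_losses(global_loss: float, local_loss: float, alpha: float = 0.75) -> float:
    """Weighted sum of a global and a local loss term.

    Assumption (not confirmed by the README): alpha weights the global
    term and (1 - alpha) weights the local term.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * global_loss + (1.0 - alpha) * local_loss


# With alpha = 0.9 the global term dominates; with alpha = 0.75 the
# local term receives a quarter of the total weight.
mostly_global = combine_losses(global_loss=2.0, local_loss=4.0, alpha=0.9)
more_local = combine_losses(global_loss=2.0, local_loss=4.0, alpha=0.75)
```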

Single-node local training

To pretrain VICRegL with a ResNet-50 backbone on a single node with 8 GPUs for 100 epochs, run:

python -m torch.distributed.launch --nproc_per_node=8 main_vicregl.py --fp16 --exp-dir /path/to/experiment/ --arch resnet50 --epochs 100 --batch-size 512 --optimizer lars --base-lr 0.3 --weight-decay 1e-06 --size-crops 224 --num-crops 2 --min_scale_crops 0.08 --max_scale_crops 1.0 --alpha 0.75

To pretrain VICRegL with a ConvNeXt-S backbone, run:

python -m torch.distributed.launch --nproc_per_node=8 main_vicregl.py --fp16 --exp-dir /path/to/experiment/ --arch convnext_small --epochs 100 --batch-size 384 --optimizer adamw --base-lr 0.00075 --alpha 0.75

Multi-node training with SLURM

To pretrain VICRegL with a ResNet-50 backbone, with submitit (pip install submitit) and SLURM on 4 nodes with 8 GPUs each for 300 epochs, run:

python run_with_submitit.py --nodes 4 --ngpus 8 --fp16 --exp-dir /path/to/experiment/ --arch resnet50 --epochs 300 --batch-size 2048 --optimizer lars --base-lr 0.2 --weight-decay 1e-06 --size-crops 224 --num-crops 2 --min_scale_crops 0.08 --max_scale_crops 1.0 --alpha 0.75

To pretrain VICRegL with a ConvNeXt-B backbone, run:

python run_with_submitit.py --nodes 2 --ngpus 8 --fp16 --exp-dir /path/to/experiment/ --arch convnext_base --epochs 400 --batch-size 576 --optimizer adamw --base-lr 0.0005 --alpha 0.75

Evaluation

Linear evaluation

To evaluate a pretrained backbone (resnet50, convnext_small, convnext_base, convnext_xlarge) on linear classification on ImageNet, run:

python evaluate.py --data-dir /path/to/imagenet/ --pretrained /path/to/checkpoint/model.pth --exp-dir /path/to/experiment/ --arch [backbone] --lr-head [lr]

Use lr=0.02 for ResNet models and lr=0.3 for ConvNeXt models.
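The choice of head learning rate can be scripted. The following sketch picks the value recommended above from the architecture name and assembles the evaluate.py command shown earlier. `lr_head_for` and `evaluate_command` are hypothetical helpers, not part of the repository, and the paths are placeholders.

```python
import shlex

# Recommended head learning rates from this README, keyed by arch-name prefix.
LR_HEAD_BY_PREFIX = {"resnet": 0.02, "convnext": 0.3}


def lr_head_for(arch: str) -> float:
    """Return the recommended --lr-head value for a backbone name."""
    for prefix, lr in LR_HEAD_BY_PREFIX.items():
        if arch.startswith(prefix):
            return lr
    raise ValueError(f"no recommended learning rate for arch {arch!r}")


def evaluate_command(arch: str, data_dir: str, pretrained: str, exp_dir: str) -> list:
    """Build the argument list for the linear evaluation script."""
    return [
        "python", "evaluate.py",
        "--data-dir", data_dir,
        "--pretrained", pretrained,
        "--exp-dir", exp_dir,
        "--arch", arch,
        "--lr-head", str(lr_head_for(arch)),
    ]


# Print a shell-ready command line (placeholder paths).
cmd = evaluate_command(
    "convnext_small",
    "/path/to/imagenet/",
    "/path/to/checkpoint/model.pth",
    "/path/to/experiment/",
)
print(shlex.join(cmd))
```

Returning a list rather than a single string means the result can be passed directly to `subprocess.run` without shell quoting concerns.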

Linear segmentation

See the segmentation folder.

License

This project is released under the CC-BY-NC License. See LICENSE for details.

Citation

If you find this repository useful, please consider giving it a star and citing the paper:

@inproceedings{bardes2022vicregl,
  author  = {Adrien Bardes and Jean Ponce and Yann LeCun},
  title   = {VICRegL: Self-Supervised Learning of Local Visual Features},
  booktitle = {NeurIPS},
  year    = {2022},
}