This repository contains the PyTorch code and pretrained models for our paper: BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search (ICCV 2021).
Illustration of the Siamese supernets training with ensemble bootstrapping.
Illustration of the fabric-like Hybrid CNN-transformer Search Space with flexible down-sampling positions.
Here is a summary of our searched models:
Model | MAdds | Step time | Top-1 (%) | Top-5 (%) | URL |
---|---|---|---|---|---|
BossNet-T0 w/o SE | 3.4B | 101ms | 80.5 | 95.0 | checkpoint |
BossNet-T0 | 3.4B | 115ms | 80.8 | 95.2 | checkpoint |
BossNet-T0^ | 5.7B | 147ms | 81.6 | 95.6 | same as above |
BossNet-T1 | 7.9B | 156ms | 81.9 | 95.6 | checkpoint |
BossNet-T1^ | 10.5B | 165ms | 82.2 | 95.7 | same as above |
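The released checkpoints can be inspected or loaded with standard PyTorch tooling. Below is a minimal sketch, assuming the file name matches the download and that the weights are stored either directly as a state dict or under a "model" key (both are assumptions; adjust to the actual checkpoint layout):

```python
# Minimal sketch: inspect a released BossNet-T checkpoint.
# The "model" key and the file path are assumptions; adjust to the actual download.
import torch

ckpt = torch.load("PATH/TO/BossNet-T0-80_8.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt
print(f"{len(state_dict)} tensors, "
      f"{sum(p.numel() for p in state_dict.values()) / 1e6:.1f}M parameters")
```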
Here is a summary of architecture rating accuracy of our method:
Search space | Dataset | Kendall tau | Spearman rho | Pearson R |
---|---|---|---|---|
MBConv | ImageNet | 0.65 | 0.78 | 0.85 |
NATS-Bench Ss | CIFAR-10 | 0.53 | 0.73 | 0.72 |
NATS-Bench Ss | CIFAR-100 | 0.59 | 0.76 | 0.79 |
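For reference, the three correlation metrics in the table can be computed with scipy from a list of predicted architecture scores and their ground-truth accuracies. The arrays below are placeholder values, not our data:

```python
# Sketch of the rating metrics used above (placeholder numbers, not our data).
from scipy import stats

predicted_scores = [0.42, 0.58, 0.31, 0.77, 0.69]   # e.g., scores produced by the supernet
groundtruth_acc  = [70.1, 73.4, 68.2, 75.9, 74.8]   # e.g., stand-alone top-1 accuracies

tau, _ = stats.kendalltau(predicted_scores, groundtruth_acc)
rho, _ = stats.spearmanr(predicted_scores, groundtruth_acc)
r, _   = stats.pearsonr(predicted_scores, groundtruth_acc)
print(f"Kendall tau {tau:.2f}, Spearman rho {rho:.2f}, Pearson R {r:.2f}")
```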
- Linux
- Python 3.5+
- CUDA 9.0 or higher
- NCCL 2
- GCC 4.9 or higher
Install PyTorch 1.7.0+ and torchvision 0.8.1+, for example:
conda install -c pytorch pytorch torchvision
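Once PyTorch is installed, a quick sanity check against the prerequisites above (a convenience snippet, not part of our pipeline; the NCCL query assumes a CUDA-enabled Linux build):

```python
# Quick environment check against the prerequisites listed above.
import sys
import torch

print("Python:", sys.version.split()[0])             # needs 3.5+
print("PyTorch:", torch.__version__)                  # needs 1.7.0+
print("CUDA (PyTorch build):", torch.version.cuda)    # needs 9.0+
print("NCCL:", torch.cuda.nccl.version())             # needs NCCL 2
print("GPUs visible:", torch.cuda.device_count())
```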
Install [Apex](https://github.com/NVIDIA/apex), for example:
git clone https://github.com/NVIDIA/apex.git
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
Install pytorch-image-models 0.3.2, for example:
pip install timm==0.3.2
Install OpenSelfSup. As the original OpenSelfSup cannot be installed as a site package, please install our forked and modified version, for example:
git clone https://github.com/changlin31/OpenSelfSup.git
cd OpenSelfSup
pip install -v --no-cache-dir .
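After installing Apex, timm, and OpenSelfSup, the imports below should succeed. The import names are the usual ones for these projects and are assumptions insofar as the fork keeps them unchanged:

```python
# Sanity-check the installed dependencies (import names assumed unchanged in the fork).
import apex          # NVIDIA Apex with the C++/CUDA extensions
import timm          # pytorch-image-models
import openselfsup   # forked OpenSelfSup, installed as a site package

print("timm", timm.__version__)  # expect 0.3.2
```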
Download the ImageNet dataset and its meta files, and put them under /YOURDATAROOT/imagenet/
Download the NATS-Bench split version of the CIFAR datasets from Google Drive and put them under /YOURDATAROOT/cifar/
Prepare the BossNAS repository:
git clone https://github.com/changlin31/BossNAS.git
cd BossNAS
ln -s /YOURDATAROOT data
BossNAS
├── ranking_mbconv
├── ranking_nats
├── retraining_hytra
├── searching
├── data
│ ├── imagenet
│ │ ├── meta
│ │ ├── train
│ │ │ ├── n01440764
│ │ │ ├── n01443537
│ │ │ ├── ...
│ │ ├── val
│ │ │ ├── n01440764
│ │ │ ├── n01443537
│ │ │ ├── ...
│ ├── cifar
│ │ ├── cifar-10-batches-py
│ │ ├── cifar-100-python
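A small check that the symlinked data directory matches the layout above (paths follow the tree shown; adjust if your layout differs):

```python
# Verify the expected data layout under the symlinked ./data directory.
from pathlib import Path

root = Path("data")
expected = [
    root / "imagenet" / "meta",
    root / "imagenet" / "train",
    root / "imagenet" / "val",
    root / "cifar" / "cifar-10-batches-py",
    root / "cifar" / "cifar-100-python",
]
for path in expected:
    print(f"{'ok     ' if path.exists() else 'MISSING'} {path}")
```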
First, move to the retraining code directory to perform retraining or evaluation.
cd retraining_hytra
Our retraining code for BossNet-T is based on the DeiT repository.
Evaluate our BossNet-T models with the following commands. Please adjust --resume and --input-size accordingly. You can change the --nproc_per_node option to suit your GPU numbers.
python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --model bossnet_T0 --input-size 224 --batch-size 128 --data-path ../data/imagenet --num_workers 8 --eval --resume PATH/TO/BossNet-T0-80_8.pth
python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --model bossnet_T1 --input-size 224 --batch-size 128 --data-path ../data/imagenet --num_workers 8 --eval --resume PATH/TO/BossNet-T1-81_9.pth
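If you only have a single GPU, a DeiT-style main.py can usually also be run without the distributed launcher. This is a hedged convenience variant; the distributed commands above are the ones we used:
python main.py --model bossnet_T0 --input-size 224 --batch-size 128 --data-path ../data/imagenet --num_workers 8 --eval --resume PATH/TO/BossNet-T0-80_8.pth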
Retrain our BossNet-T models with the following commands. You can change the --nproc_per_node option to suit your GPU numbers. Please note that the learning rate will be automatically scaled according to the number of GPUs and the batch size. We recommend training with a batch size of 128 on 8 GPUs (takes about 2 days).
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model bossnet_T0 --input-size 224 --batch-size 128 --data-path ../data/imagenet --num_workers 8
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model bossnet_T1 --input-size 224 --batch-size 128 --data-path ../data/imagenet --num_workers 8
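The automatic learning-rate scaling mentioned above follows the DeiT convention of scaling the base learning rate linearly with the total batch size. A sketch of the rule, where the base LR of 5e-4 and the divisor 512 are DeiT defaults and should be treated as assumptions for this repository:

```python
# Sketch of DeiT-style linear LR scaling (base LR and divisor assumed from DeiT defaults).
base_lr = 5e-4        # DeiT default --lr
batch_per_gpu = 128
num_gpus = 8

total_batch = batch_per_gpu * num_gpus       # 1024
scaled_lr = base_lr * total_batch / 512.0    # -> 1e-3
print(f"effective batch {total_batch}, scaled lr {scaled_lr:g}")
```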
Calculate the MAdds for BossNet-T models with the following command:
python retraining_hytra/boss_madds.py
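boss_madds.py is the authoritative script for the numbers in the table above. For readers curious how such counts are obtained, below is a generic forward-hook sketch that counts multiply-adds for Conv2d and Linear layers on a stand-in torchvision model; it is an illustration, not the BossNet-T implementation:

```python
# Generic MAdds counting sketch using forward hooks (illustration only).
import torch
import torch.nn as nn
from torchvision.models import resnet18

def count_madds(model, input_size=(1, 3, 224, 224)):
    totals = []

    def conv_hook(module, inputs, output):
        # MAdds for a conv: output positions x kernel ops per position
        kernel_ops = (module.in_channels // module.groups
                      * module.kernel_size[0] * module.kernel_size[1])
        totals.append(output.numel() * kernel_ops)

    def linear_hook(module, inputs, output):
        totals.append(output.numel() * module.in_features)

    handles = []
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            handles.append(m.register_forward_hook(conv_hook))
        elif isinstance(m, nn.Linear):
            handles.append(m.register_forward_hook(linear_hook))

    model.eval()
    with torch.no_grad():
        model(torch.zeros(input_size))
    for h in handles:
        h.remove()
    return sum(totals)

print(f"{count_madds(resnet18()) / 1e9:.2f} GMAdds")  # stand-in model, not BossNet-T
```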
Architecture of our BossNet-T0
Get the ranking correlations of BossNAS on the MBConv search space with the following commands:
cd ranking_mbconv
python get_model_score_mbconv.py
Get the ranking correlations of BossNAS on the NATS-Bench Ss search space with the following commands:
cd ranking_nats
python get_model_score_nats.py
First, go to the searching code directory:
cd searching
Search in NATS-Bench Ss Search Space on CIFAR datasets (4 GPUs, 3 hrs)
bash dist_train.sh configs/nats_c10_bs256_accumulate4_gpus4.py 4
bash dist_train.sh configs/nats_c100_bs256_accumulate4_gpus4.py 4
Search in MBConv Search Space on ImageNet (8 GPUs, 1.5 days)
bash dist_train.sh configs/mbconv_bs64_accumulate8_ep6_multi_aug_gpus8.py 8
Search in HyTra Search Space on ImageNet (8 GPUs, 4 days, memory requirement: 24G)
bash dist_train.sh configs/hytra_bs64_accumulate8_ep6_multi_aug_gpus8.py 8
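dist_train.sh takes a config file and a GPU count. In OpenSelfSup-style repositories it is typically a thin wrapper around torch.distributed.launch, expanding roughly to the line below; the exact script path and arguments in this repository may differ:
python -m torch.distributed.launch --nproc_per_node=<GPUS> train.py <CONFIG> --launcher pytorch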
If you use our code for your paper, please cite:
@inproceedings{li2021bossnas,
author = {Li, Changlin and
Tang, Tao and
Wang, Guangrun and
Peng, Jiefeng and
Wang, Bing and
Liang, Xiaodan and
Chang, Xiaojun},
title = {{B}oss{NAS}: Exploring Hybrid {CNN}-transformers with Block-wisely Self-supervised Neural Architecture Search},
booktitle = {ICCV},
year = 2021,
}