samuelstevens / swin-transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" (https://arxiv.org/abs/2103.14030).
MIT License

Swin Transformer

Link to original Swin Transformer project

Installation Instructions

  1. Set up Python packages
python -m venv venv
# Activate the virtual environment (bash/zsh):
source venv/bin/activate
# or, in fish:
source venv/bin/activate.fish

CUDA 11.6

pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116

CUDA 11.3

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113

Python packages

pip install matplotlib yacs timm einops black isort flake8 flake8-bugbear termcolor wandb preface opencv-python
  2. Install Apex

Apex is only needed if you want to train with fp16 (mixed precision); skip this step otherwise.

git clone https://github.com/NVIDIA/apex.git
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
  3. Download Data

We use the iNat21 dataset, available on GitHub.

cd /mnt/10tb
mkdir -p data/inat21
cd data/inat21
mkdir compressed raw
cd compressed
wget https://ml-inat-competition-datasets.s3.amazonaws.com/2021/train.tar.gz
wget https://ml-inat-competition-datasets.s3.amazonaws.com/2021/val.tar.gz

# pv is just a progress bar; tar's -C flag extracts directly into raw/
pv val.tar.gz | tar -xz -C ../raw/

pv train.tar.gz | tar -xz -C ../raw/
  4. Preprocess iNat21

Use your root data folder and your size of choice.

export DATA_DIR=/mnt/10tb/data/inat21/
python -m data.inat preprocess $DATA_DIR val resize 192
python -m data.inat preprocess $DATA_DIR train resize 192
python -m data.inat preprocess $DATA_DIR val resize 256
python -m data.inat preprocess $DATA_DIR train resize 256
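The actual resize logic lives in data/inat.py; assuming `resize 192` scales the shorter image side to the target while preserving aspect ratio (an assumption, not confirmed by this README), the output dimensions work out as:

```python
def resized_dims(width, height, target):
    """Dimensions after scaling the *shorter* side to `target`,
    preserving aspect ratio. Hypothetical sketch of the resize step."""
    if width <= height:
        # width is the shorter side; scale height by the same factor
        return target, round(height * target / width)
    return round(width * target / height), target
```

For example, a 400x300 image resized with target 192 becomes 256x192.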
  5. Log in to Wandb
wandb login
  6. Set up an env.fish file:

You need to provide the $VENV and $RUN_OUTPUT environment variables. I recommend saving them in a file.

In fish:

# scripts/env.fish
set -gx VENV venv
set -gx RUN_OUTPUT /mnt/10tb/models/hierarchical-vision

Then run source scripts/env.fish
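If you use bash or zsh instead of fish, an equivalent env file (the name scripts/env.sh is my choice, not part of the repo) would be:

```shell
# scripts/env.sh -- bash/zsh equivalent of scripts/env.fish (hypothetical filename)
export VENV=venv
export RUN_OUTPUT=/mnt/10tb/models/hierarchical-vision
```

Then run source scripts/env.sh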

AWS Helpers

Uninstall v1 of awscli:

sudo /usr/local/bin/pip uninstall awscli

Install v2:

cd ~/pkg
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
./aws/install --bin-dir ~/.local/bin --install-dir ~/.local/aws-cli

Download Downstream Data

NA Birds

Download the .tar.gz file from https://dl.allaboutbirds.org/nabirds.

wget https://www.dropbox.com/s/nf78cbxq6bxpcfc/nabirds.tar.gz
tar -xf nabirds.tar.gz

Run our script to generate the same stratified train/val split from the training data:

python -m src.tools.nabirds_stratified_split --input <nabirds-location> --output <output-directory> 
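A stratified split keeps each class's train/val proportions equal, so rare bird species appear in both splits. The real logic lives in src.tools.nabirds_stratified_split; a minimal sketch of the idea (hypothetical function, fixed seed for reproducibility) looks like:

```python
import random
from collections import defaultdict

def stratified_split(samples, val_frac=0.1, seed=0):
    """Split (path, label) pairs so every class contributes ~val_frac to val.

    Hypothetical sketch; not the actual script's implementation.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)
    train, val = [], []
    for label, paths in by_label.items():
        rng.shuffle(paths)
        # take at least one example per class for validation
        n_val = max(1, int(len(paths) * val_frac))
        val += [(p, label) for p in paths[:n_val]]
        train += [(p, label) for p in paths[n_val:]]
    return train, val
```

With a fixed seed, rerunning the script reproduces the same split.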

IP102

You can use rclone to download it directly from Google Drive. Otherwise, check the GitHub repo: https://github.com/xpwu95/IP102

rclone copy gdrive:IP102_v1.1/Classification <local-directory> --drive-shared-with-me

Move the data:

python -m src.tools.ip102_preprocess --input <input-directory> --output <output-directory>