Before training on your own data set, open the train_gpu.py file and modify the data_root, batch_size and nb_classes parameters. If you want to draw the confusion matrix and ROC curve, simply uncomment the Plot_ROC and Predictor calls at the end of the code and change their third parameter to the path of your own model weights file (.pth).
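As a rough illustration, the edits usually look like the sketch below. The variable names mirror the description above, but the exact Plot_ROC / Predictor argument lists and the example paths are assumptions, not the verbatim contents of train_gpu.py.

data_root = '/path/to/your/dataset'    # root directory of your own data set
batch_size = 32                        # adjust to your GPU memory
nb_classes = 5                         # number of classes in your data set

# At the end of train_gpu.py, uncomment the evaluation helpers and set the third
# argument to your trained weights (.pth); argument names here are illustrative:
# Plot_ROC(test_loader, model, './your_weights.pth', device)
# Predictor(test_loader, model, './your_weights.pth', device)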
You can also use the Sophia optimizer; just swap the optimizer in train_gpu.py as shown below. For this training sample it can achieve better results.
# optimizer = create_optimizer(args, model_without_ddp)
optimizer = SophiaG(model.parameters(), lr=2e-4, betas=(0.965, 0.99), rho=0.01, weight_decay=args.weight_decay)
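Since SophiaG follows the standard torch.optim.Optimizer interface, the rest of the training loop stays unchanged. Below is a minimal, self-contained sketch on a toy model; it assumes SophiaG is importable from a local sophia module and omits the periodic Hessian-estimate update that the Sophia authors additionally recommend.

import torch
import torch.nn as nn
from sophia import SophiaG   # assumed import path; adjust to where SophiaG lives in your setup

model = nn.Linear(16, 4)     # toy model, stands in for the EfficientViT model
criterion = nn.CrossEntropyLoss()
optimizer = SophiaG(model.parameters(), lr=2e-4, betas=(0.965, 0.99),
                    rho=0.01, weight_decay=0.05)

for _ in range(10):          # stands in for iterating over the real data loader
    images = torch.randn(8, 16)
    targets = torch.randint(0, 4, (8,))
    loss = criterion(model(images), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()         # same call pattern as AdamW or any other torch optimizer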
Train on a single GPU:
python train_gpu.py
Train with multiple GPUs on a single machine (here with 8 cards):
python -m torch.distributed.launch --nproc_per_node=8 train_gpu.py
(To use only specific cards, e.g. the second and fourth GPUs:)
CUDA_VISIBLE_DEVICES=1,3 python -m torch.distributed.launch --nproc_per_node=2 train_gpu.py
(Set --nproc_per_node to the number of GPUs available on each machine. To restrict training to specific cards, prefix each command with CUDA_VISIBLE_DEVICES=<card indices>; the principle is the same as single-machine multi-GPU training.)
On the first machine: python -m torch.distributed.launch --nproc_per_node=1 --nnodes=2 --node_rank=0 --master_addr=<Master node IP address> --master_port=<Master node port number> train_gpu.py
On the second machine: python -m torch.distributed.launch --nproc_per_node=1 --nnodes=2 --node_rank=1 --master_addr=<Master node IP address> --master_port=<Master node port number> train_gpu.py
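Note: torch.distributed.launch is deprecated in recent PyTorch releases. If your installation provides torchrun, the equivalent multi-node commands are (same flags, adapt to your setup):
On the first machine: torchrun --nproc_per_node=1 --nnodes=2 --node_rank=0 --master_addr=<Master node IP address> --master_port=<Master node port number> train_gpu.py
On the second machine: torchrun --nproc_per_node=1 --nnodes=2 --node_rank=1 --master_addr=<Master node IP address> --master_port=<Master node port number> train_gpu.py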
@inproceedings{liu2023efficientvit,
title={EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention},
author={Liu, Xinyu and Peng, Houwen and Zheng, Ningxin and Yang, Yuqing and Hu, Han and Yuan, Yixuan},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={14420--14430},
year={2023}
}