Maybe I missed something?
Hi @CYL0089, thanks for your attention to our work! The batch size per GPU is 128, and the number of GPUs is 32.
We did not try enabling SyncBN, but I think a per-GPU batch size of 128 is enough for BatchNorm.
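For anyone who does want to try SyncBN, here is a minimal sketch of the standard PyTorch way to enable it under DistributedDataParallel. This is generic PyTorch usage with a hypothetical `build_ddp_model` helper for illustration, not necessarily how TinyViT's `--use-sync-bn` option is wired up internally:

```python
# Minimal sketch: enabling SyncBN in a DDP setup (generic PyTorch pattern,
# not necessarily how TinyViT's --use-sync-bn option is implemented).
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def build_ddp_model(model: torch.nn.Module, local_rank: int, use_sync_bn: bool = False) -> DDP:
    if use_sync_bn:
        # Replace every BatchNorm layer with SyncBatchNorm so statistics are
        # synchronized across all processes instead of being computed only
        # over the per-GPU batch (128 in this case).
        model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
    model = model.cuda(local_rank)
    return DDP(model, device_ids=[local_rank])
```

With SyncBN, the BatchNorm statistics are computed over the global batch rather than the 128 samples on each GPU, which mainly matters when the per-GPU batch is small.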
I see, thank you
In the paper:
but in the code:
and the pretrain and finetune commands (in TinyViT/docs/TRAINING.md) do not include the `--use-sync-bn` flag:
```bash
python -m torch.distributed.launch --master_addr=$MASTER_ADDR --nproc_per_node 8 --nnodes=4 --node_rank=$NODE_RANK main.py --cfg configs/22k_distill/tiny_vit_21m_22k_distill.yaml --data-path ./ImageNet-22k --batch-size 128 --output ./output --opts DISTILL.TEACHER_LOGITS_PATH ./teacher_logits/

python -m torch.distributed.launch --nproc_per_node 8 main.py --cfg configs/22kto1k/tiny_vit_21m_22kto1k.yaml --data-path ./ImageNet --batch-size 128 --pretrained ./checkpoints/tiny_vit_21m_22k_distill.pth --output ./output
```
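For reference, assuming `--batch-size` is the per-GPU value (as the reply above states) and ignoring any gradient accumulation, the effective global batch sizes implied by these launch flags would be:

```python
# Effective (global) batch size implied by the launch flags above,
# assuming --batch-size is the per-GPU value and no gradient accumulation.
per_gpu_batch = 128

pretrain_gpus = 4 * 8   # --nnodes=4 * --nproc_per_node 8 = 32 GPUs
finetune_gpus = 1 * 8   # single node, --nproc_per_node 8 = 8 GPUs

print("pretrain global batch:", pretrain_gpus * per_gpu_batch)   # 4096
print("finetune global batch:", finetune_gpus * per_gpu_batch)   # 1024
```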
So the batch size is actually 128 per GPU?