TACJu / TransFG

This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

Batch_size is 16 or 64? #15

Open · qilong-zhang opened this issue 3 years ago

qilong-zhang commented 3 years ago

Hi @TACJu, I notice that you apply DDP with 4 GPUs in train.py. Therefore, if the batch_size in args is set to 16, the overall batch size should be 16x4=64. However, the paper says the batch size is 16. I also tried 16 per GPU across 4 Tesla V100s, but it raised an OOM error, so does batch_size=16 mean 16 in total or 64? Thanks!
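
For context, here is a minimal sketch (assumed names and a toy dataset, not the repo's exact train.py) of why the per-process batch size multiplies by the number of GPUs under DDP:

```python
import os
import torch
import torch.distributed as dist
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def build_loader(per_gpu_batch_size=16):
    # World size would be 4 when launched with --nproc_per_node=4; default to 1 otherwise.
    world_size = dist.get_world_size() if dist.is_initialized() else int(os.environ.get("WORLD_SIZE", 1))
    # Toy dataset standing in for CUB_200_2011.
    dataset = TensorDataset(torch.randn(64, 16), torch.randint(0, 200, (64,)))
    # DistributedSampler gives each rank a disjoint shard, so batch_size here is per process.
    sampler = DistributedSampler(dataset) if dist.is_initialized() else None
    loader = DataLoader(dataset, batch_size=per_gpu_batch_size, sampler=sampler)
    print(f"per-GPU batch: {per_gpu_batch_size}, global batch: {per_gpu_batch_size * world_size}")
    return loader

if __name__ == "__main__":
    build_loader(16)  # with 4 processes this corresponds to a global batch of 64
```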

slothfulxtx commented 2 years ago

Maybe the batch size is 16*4 = 64. I ran the code with a batch size of 4 per GPU on 4 GPUs (4*4=16 in total), and the accuracy on the CUB_200_2011 dataset was only 90.9%. After changing it to 8 per GPU (4*8=32 in total; 16 per GPU causes OOM on my server with 4 RTX 3090 GPUs), the accuracy rose to 91.4%.
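
For anyone hitting the same memory limit, a hedged sketch of gradient accumulation (a generic PyTorch loop with toy stand-ins, not the repo's train.py) that recovers a larger effective batch size while keeping small per-GPU batches:

```python
import torch
import torch.nn as nn

# Toy stand-ins; the real model and data come from train.py and the CUB dataset.
model = nn.Linear(16, 200)
optimizer = torch.optim.SGD(model.parameters(), lr=3e-2)
criterion = nn.CrossEntropyLoss()
loader = [(torch.randn(4, 16), torch.randint(0, 200, (4,))) for _ in range(8)]

accumulation_steps = 4  # effective per-GPU batch = 4 * 4 = 16 without extra memory

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    loss = criterion(model(x), y)
    (loss / accumulation_steps).backward()  # scale so accumulated grads match one large batch
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```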