Closed: FreddieRao closed this issue 3 years ago
Hi there!
Thank you for bringing this up.
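The small difference you observe is likely due to the hardware setup. In particular, we used 16 GPUs distributed over 2 nodes, which changes the seed of each subprocess via `seed = args.seed + utils.get_rank()`. With 8 GPUs the ranks run from 0 to 7, whereas with 16 GPUs they run from 0 to 15, so the two setups do not use the same set of seeds.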
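To illustrate the seeding point, here is a minimal sketch of how the per-process seeds differ between an 8-GPU and a 16-GPU run. It is not the repository's code: `per_rank_seeds` and `first_draws` are hypothetical helpers, the base seed of 0 is an assumed value, and Python's standard `random` module stands in for the framework's random number generators.

```python
import random
from typing import List


def per_rank_seeds(base_seed: int, world_size: int) -> List[int]:
    """Seeds obtained by applying seed = base_seed + rank to every rank."""
    return [base_seed + rank for rank in range(world_size)]


def first_draws(seed: int, n: int = 3) -> List[float]:
    """First n values of a generator seeded with `seed` (stand-in for any RNG)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


BASE_SEED = 0  # assumed value, for illustration only

seeds_8 = per_rank_seeds(BASE_SEED, 8)    # one seed per process on 8 GPUs
seeds_16 = per_rank_seeds(BASE_SEED, 16)  # one seed per process on 16 GPUs

# Seeds that only exist in the 16-GPU run (ranks 8 to 15).
extra_seeds = sorted(set(seeds_16) - set(seeds_8))

print("8 GPUs :", seeds_8)
print("16 GPUs:", seeds_16)
print("only in the 16-GPU run:", extra_seeds)

# Different seeds give different random streams, so anything driven by the
# per-process generator differs between the two setups.
print(first_draws(seeds_16[0]) != first_draws(seeds_16[8]))
```

The same seed always reproduces the same stream, so a run is only comparable to another run with the same number of processes and the same base seed.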
If it is any help, we could share some log files.
Best,
Stéphane
Dear contributors,
Thanks for releasing your code. We used your codebase to train ConViT-Ti on the ImageNet dataset and obtained 72.5% Top-1 accuracy, which is 0.6 percentage points lower than the result you reported. Could you please let us know how to reproduce your result? Here is our setting:
8 V100 GPUs, nproc_per_node=8, batch-size=128. The other settings are the same as in your main.py file. Here we upload the file for your reference:
Many thanks!