Hi,
The MoCo v3 paper has a section on computation time. It says that for ViT-B, 100 epochs of ImageNet take 2.1 hours. It is not clear whether this refers to 512 TPU devices or 512 TPU cores. To be precise, there are two types of TPUs available on Google Cloud: v2-[32-512] and v3-[32-2048]. Which one was used in the experiment, and how many for each instance?