chmxu / eTT_TMLR2022


Training hyperparameters on the ImageNet1K train set #7

Closed · yiyuyuyi closed this issue 10 months ago

yiyuyuyi commented 10 months ago
Dear authors, hello! Thank you for open-sourcing this creative work. I would like to run some experiments building on it. Could you clarify the training hyperparameters you used for ViT-Small and ViT-Tiny when pretraining with DINO, in particular the batch size and the number of epochs? Did you use the default settings in your code for both? Roughly how many GPUs, and with how much memory each, would this take? Also, your paper mentions trying fine-tuning with DeiT; did the fine-tuning learning rate and epochs strictly follow the original DeiT paper?
Looking forward to your answer; it would be greatly appreciated!
chmxu commented 10 months ago

Hi, we follow the original hyper-parameters of ViT-S to pretrain both ViT-S and ViT-T, which should be the same as DeiT's. I cannot remember the details, but I think 4 or 8 GPUs are enough for this process. We did not try DeiT.

yiyuyuyi commented 10 months ago

> Hi, we follow the original hyper-parameters of ViT-S to pretrain both ViT-S and ViT-T, which should be the same as DeiT's. I cannot remember the details, but I think 4 or 8 GPUs are enough for this process. We did not try DeiT.

Hi, thanks for the reply! You may have misread my question: what I want to know are the hyperparameters you used when training DINO. The default in your paper is 100 epochs with a batch size of 512, while the code released by the DINO authors offers three epoch settings (100, 300, or 800) and a batch size of 512 or 1024. Did you use epochs = 100 and batch size = 512?

chmxu commented 10 months ago

We use 100 epochs and a batch size of 512.
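
For reference, a launch command along the following lines should reproduce this configuration (a sketch assuming the official facebookresearch/dino repo's `main_dino.py` and 8 GPUs, so 8 × 64 images per GPU gives the effective batch size of 512; the dataset and output paths are placeholders):

```bash
# Sketch: DINO pretraining of ViT-Small on the ImageNet1K train set,
# assuming the official facebookresearch/dino main_dino.py script.
# 8 GPUs x 64 images per GPU = effective batch size 512, for 100 epochs.
python -m torch.distributed.launch --nproc_per_node=8 main_dino.py \
  --arch vit_small \
  --epochs 100 \
  --batch_size_per_gpu 64 \
  --data_path /path/to/imagenet/train \
  --output_dir /path/to/output
```

For ViT-Tiny, swapping in `--arch vit_tiny` should work (DINO's `vision_transformer.py` registers that architecture); on 4 GPUs, `--batch_size_per_gpu 128` would keep the effective batch size at 512, memory permitting.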

yiyuyuyi commented 10 months ago

> We use 100 epochs and a batch size of 512.

Got it, thank you very much!