chmxu / eTT_TMLR2022


Training hyperparameters on the ImageNet1K train set #7

Closed · yiyuyuyi closed this issue 10 months ago

yiyuyuyi commented 10 months ago
Dear authors, hello! Thank you for open-sourcing this creative work. I would like to run some experiments building on it. Could you clarify the training hyperparameters you used for ViT-Small and ViT-Tiny when pretraining with DINO, in particular the batch size and the number of epochs? Did you use the default settings in your code for both? Roughly how many GPUs, and with how much memory each, would this take? Also, your paper mentions trying fine-tuning with DeiT; did the fine-tuning learning rate and epochs strictly follow the original DeiT paper?
Looking forward to your answer; it would be greatly appreciated!
chmxu commented 10 months ago

Hi, we follow the original hyper-parameters of ViT-S to pretrain both ViT-S and ViT-T, which should be the same as DeiT's. I cannot remember the details, but I think 4 or 8 GPUs are enough for this process. We did not try DeiT.

yiyuyuyi commented 10 months ago

> Hi, we follow the original hyper-parameters of ViT-S to pretrain both ViT-S and ViT-T, which should be the same as DeiT's. I cannot remember the details, but I think 4 or 8 GPUs are enough for this process. We did not try DeiT.

Hi, thanks for the reply! You may have misread my question: what I want to know are the hyperparameters you used when training DINO. The default in your paper is 100 epochs with a batch size of 512, while the code released by the DINO authors offers three epoch settings (100, 300, or 800) and a batch size of 512 or 1024. Did you use epochs = 100 and batch size = 512?

chmxu commented 10 months ago

We use 100 epochs and a batch size of 512.
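
For reference, a launch command along the following lines should reproduce this configuration (a sketch assuming the official facebookresearch/dino repo's `main_dino.py` and 8 GPUs, so 8 × 64 images per GPU gives the effective batch size of 512; the dataset and output paths are placeholders):

```bash
# Sketch: DINO pretraining of ViT-Small on the ImageNet1K train set,
# assuming the official facebookresearch/dino main_dino.py script.
# 8 GPUs x 64 images per GPU = effective batch size 512, for 100 epochs.
python -m torch.distributed.launch --nproc_per_node=8 main_dino.py \
  --arch vit_small \
  --epochs 100 \
  --batch_size_per_gpu 64 \
  --data_path /path/to/imagenet/train \
  --output_dir /path/to/output
```

For ViT-Tiny, swapping in `--arch vit_tiny` should work (DINO's `vision_transformer.py` registers that architecture); on 4 GPUs, `--batch_size_per_gpu 128` would keep the effective batch size at 512, memory permitting.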

yiyuyuyi commented 10 months ago

> We use 100 epochs and a batch size of 512.

Got it, thank you very much!