SonwYang closed this issue 1 year ago
According to the original paper, they pretrain for 100 epochs on ImageNet-21k.
Thank you! I have another question: what kind of learning rate schedule was used to train the model?
They train with a cosine annealing schedule, which is also what this repo uses.
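For reference, here is a minimal sketch of what a cosine annealing schedule computes, written in plain Python. It is not taken from the paper or from this repo's code: the function name `cosine_annealing_lr`, the base learning rate, and the absence of a warmup phase are all assumptions made for illustration.

```python
import math


def cosine_annealing_lr(step, total_steps, base_lr, min_lr=0.0):
    """Return the learning rate at `step` under a cosine annealing schedule.

    Hypothetical helper for illustration only; the actual hyperparameters
    (base_lr, min_lr, warmup) used in the paper or repo may differ.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside [0, total_steps] stay at the endpoints.
    step = min(max(step, 0), total_steps)
    # Cosine factor decays smoothly from 1 (at step 0) to 0 (at total_steps).
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return min_lr + (base_lr - min_lr) * cosine


if __name__ == "__main__":
    # Assumed values: 100 epochs (as mentioned above) and a base LR of 1e-3.
    for epoch in (0, 25, 50, 75, 100):
        print(epoch, cosine_annealing_lr(epoch, total_steps=100, base_lr=1e-3))
```

In PyTorch, the same curve is available through `torch.optim.lr_scheduler.CosineAnnealingLR`, with `T_max` playing the role of `total_steps` and `eta_min` the role of `min_lr`.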
Hello! Nice work! Can you tell me how many epochs are used for the decoder-only denoising pretraining?