Function of drop_path: what value should drop_path be set to in order to improve the model's mAP?

facebookresearch / ConvNeXt

Code release for ConvNeXt model

MIT License


Function of drop_path: what value should drop_path be set to in order to improve the model's mAP? #100

Closed LUO77123 closed 2 years ago

LUO77123 commented 2 years ago

Since drop_path_rate defaults to 0, drop_path is not enabled. I would like to know what value drop_path_rate should be set to so that the model performs better. Thanks!
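To illustrate why a default of 0 means drop_path is effectively off, here is a minimal sketch of how a single drop_path_rate can be expanded into one rate per block. It assumes a linear schedule from 0 at the first block up to drop_path_rate at the last block, which is a common stochastic depth convention. The helper name `per_block_drop_rates` and the example stage depths are illustrative assumptions, not code taken from this repo.

```python
def per_block_drop_rates(drop_path_rate, depths):
    """Spread drop_path_rate linearly over all blocks.

    The first block gets 0 and the last block gets drop_path_rate.
    `depths` lists the number of blocks in each stage.
    """
    total = sum(depths)
    if total == 1:
        return [float(drop_path_rate)]
    return [drop_path_rate * i / (total - 1) for i in range(total)]


depths = (3, 3, 9, 3)  # example stage depths, assumed for illustration

rates_off = per_block_drop_rates(0.0, depths)  # default: every block gets 0
rates_on = per_block_drop_rates(0.1, depths)   # nonzero: rates grow with depth

print("all zero with the default:", all(r == 0.0 for r in rates_off))
print("first and last rate at 0.1:", rates_on[0], round(rates_on[-1], 3))
```

Under this assumption, any positive drop_path_rate leaves the earliest blocks almost untouched and regularizes the deepest blocks the most.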

liuzhuang13 commented 2 years ago

Hi,

The drop path (dp) rates for different models can be found in the paper or in this repo's training instructions. The optimal value depends on the model size and the dataset used.
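For readers unsure what the dp rate controls, the following is a minimal pure-Python sketch of drop path (stochastic depth) applied to the residual branch of a block. It is not the repo's implementation: the function names `drop_path` and `residual_block` and the list-based batch format are assumptions made for illustration. During training, each sample's branch output is zeroed with probability `drop_prob`, and the surviving outputs are scaled by `1 / (1 - drop_prob)` so the expected value is unchanged.

```python
import random


def drop_path(branch_out, drop_prob=0.0, training=False, rng=random):
    """Per-sample stochastic depth.

    branch_out: list of samples, each a list of floats (the residual branch output).
    drop_prob:  probability of dropping the branch for a given sample.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if drop_prob == 0.0 or not training:
        # Disabled (rate 0) or evaluation mode: the branch passes through unchanged.
        return [list(sample) for sample in branch_out]
    keep_prob = 1.0 - drop_prob
    out = []
    for sample in branch_out:
        if rng.random() < keep_prob:
            # Keep the branch and rescale to preserve the expected value.
            out.append([v / keep_prob for v in sample])
        else:
            # Drop the branch: this sample only sees the identity shortcut.
            out.append([0.0 for _ in sample])
    return out


def residual_block(x, branch_fn, drop_prob, training, rng=random):
    """Compute x + drop_path(branch_fn(x)) for each sample in the batch."""
    branch = drop_path([branch_fn(s) for s in x], drop_prob, training, rng)
    return [[a + b for a, b in zip(s, br)] for s, br in zip(x, branch)]


batch = [[1.0, 2.0] for _ in range(8)]
rng = random.Random(0)  # seeded so the run is repeatable
dropped = drop_path(batch, drop_prob=0.5, training=True, rng=rng)
print(dropped)
```

A larger rate drops more branches per step, which is a stronger regularizer. That fits the observation that the best value depends on model size and dataset.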

LUO77123 commented 2 years ago

Thank you. I found this in the ConvNeXt paper:

"We conduct a lightweight sweep for COCO experiments including learning rate {1e-4, 2e-4}, layer-wise learning rate decay [6] {0.7, 0.8, 0.9, 0.95}, and stochastic depth rate {0.3, 0.4, 0.5, 0.6, 0.7, 0.8}. We fine-tune the ImageNet-22K pre-trained Swin-B/L on COCO using the same sweep. We use the official code and pre-trained model weights [3]. The hyperparameters we sweep for ADE20K experiments include learning rate {8e-5, 1e-4}, layer-wise learning rate decay {0.8, 0.9}, and stochastic depth rate {0.3, 0.4, 0.5}. We report validation mIoU results using multi-scale testing. Additional single-scale testing results are in Table 7."

My task is detection, so I plan to follow the COCO settings, but my input size is 1024*1024. Should I sweep the stochastic depth rate over {0.3, 0.4, 0.5, 0.6, 0.7, 0.8}? Is this OK?

liuzhuang13 commented 2 years ago

Hello,

If your setting is the same as one of ours, you can refer to our config files for dp rates. If it's different, you may want to do a grid search. Generally larger models need larger dp rates.
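The grid search suggested here can be sketched as below, using the COCO sweep values quoted from the paper earlier in this thread. The names `grid_search`, `coco_grid`, and `placeholder_eval` are hypothetical. In particular, `placeholder_eval` is only a stand-in that returns a made-up score so the loop can run; in practice it would be replaced by a function that trains the detector with the given settings and returns the validation mAP.

```python
from itertools import product


def grid_search(grid, train_and_evaluate):
    """Try every combination in `grid`.

    Returns (best_config, best_score, all_results), where higher scores are better.
    """
    names = list(grid)
    results = []
    best_cfg, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        score = train_and_evaluate(cfg)  # e.g. validation mAP
        results.append((cfg, score))
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score, results


# Sweep values as quoted from the paper's COCO experiments.
coco_grid = {
    "lr": [1e-4, 2e-4],
    "layer_decay": [0.7, 0.8, 0.9, 0.95],
    "drop_path_rate": [0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
}


def placeholder_eval(cfg):
    # HYPOTHETICAL stand-in, not a real result: it only lets the loop execute.
    # Replace with a function that trains the model and returns validation mAP.
    return -abs(cfg["drop_path_rate"] - 0.5)


best_cfg, best_score, results = grid_search(coco_grid, placeholder_eval)
print(len(results), "configurations tried; best:", best_cfg)
```

The full grid has 48 combinations, which is expensive for detection training. A cheaper option is to fix the learning rate and layer decay from the closest published config and sweep only drop_path_rate.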

LUO77123 commented 2 years ago

thanks🦀🦀

---Original message--- From: "Zhuang @.> Sent: Wednesday, May 11, 2022, 1:17 PM To: @.>; Cc: @.**@.>; Subject: Re: [facebookresearch/ConvNeXt] Function of drop_path: what value should drop_path be set to in order to improve the model's mAP? (Issue #100)
