Westlake-AI / MogaNet

[ICLR 2024] MogaNet: Efficient Multi-order Gated Aggregation Network
https://arxiv.org/abs/2211.03295
Apache License 2.0

cooldown epochs #13

Closed dhkim0225 closed 8 months ago

dhkim0225 commented 8 months ago

Thank you for your great work!

As far as I know, models such as DeiT and ConvNeXt do not use "cooldown_epochs". However, the code suggests that MogaNet was trained for 310 epochs rather than 300. Are the accuracies in the paper posted on OpenReview all obtained from 310-epoch training?

Lupin1998 commented 8 months ago

Hi @dhkim0225, thanks for your question! The "cooldown_epochs" is not a necessary part of the training setup for MogaNet, and we also provide 300-epoch implementations and results in OpenMixup. The "cooldown_epochs" option in timm is part of our default training setup because the image classification implementation was migrated from PoolFormer; it has little effect on the final performance. It might be useful for some Transformer architectures, e.g., Uniformer. To my knowledge, whether a model is trained for 300 or 310 epochs has little to do with whether its manuscript is posted on OpenReview.
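For readers wondering where the extra 10 epochs come from, here is a minimal sketch of how a timm-style training loop typically handles cooldown: the cosine schedule decays over `epochs`, and `cooldown_epochs` extra epochs are appended at the end with the learning rate held at `lr_min`. The concrete values (300 / 10 / 1e-3 / 1e-5) and the toy model are illustrative assumptions, not the repo's exact configuration.

```python
import torch
from timm.scheduler import CosineLRScheduler

epochs = 300           # length of the cosine decay
cooldown_epochs = 10   # extra epochs appended after the decay finishes

model = torch.nn.Linear(8, 8)                                   # toy model for illustration
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

scheduler = CosineLRScheduler(
    optimizer,
    t_initial=epochs,  # cosine decays over the first 300 epochs
    lr_min=1e-5,       # value the LR is held at during cooldown
)

num_epochs = epochs + cooldown_epochs  # -> 310 epochs actually run
for epoch in range(num_epochs):
    # ... one epoch of training ...
    scheduler.step(epoch + 1)  # epochs 300..309 keep the LR at lr_min
```

Since the last 10 epochs run at the minimum learning rate, they change the weights very little, which is consistent with the observation above that cooldown has little effect on the final accuracy.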