facebookresearch / ConvNeXt

Code release for ConvNeXt model
MIT License

What's the difference between model and model_ema? #88

Closed leviome closed 2 years ago

leviome commented 2 years ago

Why do we need EMA? What is EMA?

liuzhuang13 commented 2 years ago

EMA is sometimes useful for alleviating overfitting, especially when training large models on ImageNet-1K only. It uses the exponential moving average of the weights instead of the current snapshot of the weights. You can check the cited paper or the code for what it does.
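To make the idea concrete, here is a minimal sketch of an exponential moving average over weights in plain Python. It is not the code from this repository: the `ema_update` helper, the dict-of-floats "model", and the `decay` value are all assumptions chosen for illustration. A real training loop would apply the same update to every parameter tensor after each optimizer step, usually with a decay much closer to 1.

```python
def ema_update(ema_weights, weights, decay=0.9999):
    """Blend the current weights into the running average, in place.

    Hypothetical helper for illustration:
    ema = decay * ema + (1 - decay) * current
    """
    for name, value in weights.items():
        ema_weights[name] = decay * ema_weights[name] + (1.0 - decay) * value
    return ema_weights


# Toy "model" with one scalar weight that jumps around from step to step.
weights = {"w": 1.0}
ema_weights = dict(weights)  # the EMA copy starts as a clone of the model

for step_value in [3.0, 1.0, 3.0, 1.0]:
    weights["w"] = step_value  # stand-in for an optimizer step
    ema_update(ema_weights, weights, decay=0.5)  # small decay so the effect is visible

print(weights["w"])      # current snapshot of the weight
print(ema_weights["w"])  # smoothed average over the training history
```

The current snapshot keeps jumping between 1.0 and 3.0, while the averaged copy stays between the two, which is why the averaged weights can be less sensitive to noise in the last few updates.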

leviome commented 2 years ago

"Exponential Moving Average", cool! thanks a lot!