facebookresearch / vissl

VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
https://vissl.ai
MIT License
3.25k stars, 332 forks

EMA does not work on fp16 and does not copy weights? #568

Open YannDubs opened 2 years ago

YannDubs commented 2 years ago

Hi,

Looking at the implementation of ModelEmaV2, it seems that, unlike the timm version, it only handles fp32 parameters (see this line). Does that mean it will not work if I use AMP?

Another difference from timm is that the ema_model does not appear to be copied (in timm the copy is done here). I am probably missing where the model is copied; could you point me to it, please? (If the model is not copied, then EMA simply corresponds to momentum.)
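
To make the copying concern concrete, here is a minimal, framework-free sketch. It is not VISSL or timm code: `ToyModel`, `ToyEma`, and the `copy_model` flag are hypothetical names invented for illustration, and plain Python floats stand in for tensors. It only shows what happens to the averaged weights when the EMA holder keeps a deep copy of the model versus a reference to the same object.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a network: a dict of named scalar weights."""

    def __init__(self, weights):
        self.weights = dict(weights)


class ToyEma:
    """Illustrative EMA holder (an assumption, not the VISSL or timm class)."""

    def __init__(self, model, decay=0.9, copy_model=True):
        self.decay = decay
        # With copy_model=False, self.module is the same object as the live model.
        self.module = copy.deepcopy(model) if copy_model else model

    def update(self, model):
        # ema = decay * ema + (1 - decay) * live, applied per weight
        for name, live_value in model.weights.items():
            ema_value = self.module.weights[name]
            self.module.weights[name] = (
                self.decay * ema_value + (1.0 - self.decay) * live_value
            )


model = ToyModel({"w": 0.0})
ema_copied = ToyEma(model, decay=0.9, copy_model=True)
ema_aliased = ToyEma(model, decay=0.9, copy_model=False)

model.weights["w"] = 1.0  # pretend an optimizer step changed the live weight

ema_copied.update(model)
ema_aliased.update(model)

print("copied :", ema_copied.module.weights["w"])
print("aliased:", ema_aliased.module.weights["w"])
```

In this sketch the copied holder moves only a fraction `1 - decay` of the way toward the new weight, while the aliased holder reads and writes the live weights themselves, so it keeps no separate average at all.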