General Question regarding reset running statistics of BN

Hi👋, thanks for using our code!

Whether or not EMA is used does make some difference, so I took a look at the source code. The implementation here follows the corresponding function in AttentiveNAS, and I also checked the similar approach used in the Once-for-All (OFA) network. The same code appears in both OFA and AttentiveNAS. Both are adapted from the official PyTorch implementation, but AttentiveNAS comments out the EMA update and does not implement that step, which is why EMA is not implemented.
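
To make the difference concrete, here is a minimal pure-Python sketch of the two update rules, not the actual code from this repo, OFA, or AttentiveNAS. It assumes the usual PyTorch BatchNorm convention: with a momentum value the running statistic is an exponential moving average, and with `momentum=None` it is a cumulative average over the batches seen so far. The names `update_running_stat` and `calibrate` and the batch means are made up for illustration.

```python
def update_running_stat(running, batch_stat, num_batches_tracked, momentum=None):
    """One running-statistic update in the style of PyTorch's BatchNorm.

    momentum=None  -> cumulative (simple) average over all batches so far.
    momentum=float -> exponential moving average (EMA) with that factor.
    """
    num_batches_tracked += 1
    if momentum is None:
        # Weight the new batch by 1/n, so every batch counts equally.
        factor = 1.0 / num_batches_tracked
    else:
        # Fixed weight: recent batches dominate, the initial value decays slowly.
        factor = momentum
    running = (1.0 - factor) * running + factor * batch_stat
    return running, num_batches_tracked


def calibrate(batch_means, momentum=None, init=0.0):
    """Hypothetical helper: reset the statistic to `init`, then replay batches."""
    running, n = init, 0
    for batch_mean in batch_means:
        running, n = update_running_stat(running, batch_mean, n, momentum)
    return running


batch_means = [1.0, 2.0, 3.0, 4.0]  # made-up per-batch means

cumulative = calibrate(batch_means, momentum=None)  # plain mean, about 2.5
ema = calibrate(batch_means, momentum=0.1)          # about 0.9049, still near init

print(cumulative, ema)
```

With only a few calibration batches after a reset, the EMA result stays close to the reset value, while the cumulative average equals the plain mean of the batch statistics. That may be one reason to skip momentum when recalibrating, but this is a guess on my side, not something stated in the linked sources.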

However, this line of the DynamicBN function in AttentiveNAS also carries a comment with some related links. I couldn't find a specific reason for not using momentum in the linked paper or webpage. You can check out these links, and we would be grateful for any feedback you can share.

Best regards.
