The EMA update is formulated as $model_{EMA} = d \times model_{EMA} + (1-d) \times model_{current}$, where $d = 0.9999 \times (1 - e^{-(\text{iters} \times \text{epochs} / 2000)})$.
Could you please explain the intuition behind the exponential term in $d$? Why does it depend on $\text{iters} \times \text{epochs}$, and where does the magic number 2000 come from?
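
To make the question concrete, here is a minimal sketch of the schedule exactly as written above. It assumes that $\text{iters} \times \text{epochs}$ stands for the number of updates performed so far; the names `ema_decay`, `ema_update`, `num_updates`, `base_decay`, and `tau` are hypothetical and not taken from the repository.

```python
import math


def ema_decay(num_updates, base_decay=0.9999, tau=2000.0):
    """Decay d from the formula above.

    Assumption: num_updates plays the role of iters * epochs,
    and tau is the constant 2000 from the formula.
    """
    return base_decay * (1.0 - math.exp(-num_updates / tau))


def ema_update(ema_params, current_params, d):
    """Apply model_EMA = d * model_EMA + (1 - d) * model_current element-wise."""
    return [d * e + (1.0 - d) * c for e, c in zip(ema_params, current_params)]


if __name__ == "__main__":
    # Print how d evolves as the update count grows.
    for n in (0, 100, 2000, 20000):
        print(n, ema_decay(n))
```

Under this reading, $d = 0$ at the first update, so the EMA model simply copies the current model, and $d$ approaches 0.9999 as the update count grows well past 2000.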
@yinchimaoliang Thanks for your great work.