Your question is quite old, but I hope others find this useful. The line you are referring to only runs during the first batch of each epoch. It comes from the definition of an exponential moving average.
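For anyone following along, here is a minimal sketch of that idea in plain Python. It is not the code from this repository: the names ema_update, batch_mean, and alpha are made up for illustration, and treating running_mean == 0 as "not initialised yet" is an assumption of the sketch. The first batch seeds the average directly, and every later batch blends the new value into it.

```python
def ema_update(batch_mean, running_mean, alpha=0.01):
    """Return the updated exponential moving average as a plain float.

    Assumption of this sketch: running_mean == 0 means "no batch seen yet",
    so the first batch seeds the average instead of being blended in.
    """
    if running_mean == 0:
        # First batch: there is no history, so the average is the batch value.
        return float(batch_mean)
    # Later batches: weight the new value by alpha and the history by 1 - alpha.
    return alpha * float(batch_mean) + (1.0 - alpha) * float(running_mean)


running_mean = 0  # reset to 0 before the first batch (e.g. at the start of an epoch)
for batch_mean in [2.0, 4.0, 6.0]:
    running_mean = ema_update(batch_mean, running_mean, alpha=0.5)
```

Converting to float on every call keeps running_mean the same type from one batch to the next, which is one way to avoid it being a number on the first batch and something else afterwards. Whether that matches what the original implementation intends is a guess on my part.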
Hi,
I'm curious about your implementation of the EMA loss.
When I ran mine.optimize, the code complained on the second batch that running_mean is a tensor. I'm confused about what type running_mean is supposed to be.
Did you mean
or something like that?