gd-zhang closed 6 years ago
In summary, I can only find the reward for each environment. So am I supposed to average over all envs?
Yes, if you average over all environments, you'll get the average training performance.
However, it's done automatically in TensorBoard using the smoothing factor.
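If you want the averaged curve outside of TensorBoard, here is a minimal sketch of the idea. It assumes you have already exported one reward list per environment, all logged at the same steps. The names `rewards_per_env`, `env_a`, and `env_b` are hypothetical and the numbers are placeholders, not real results. The smoothing function is a plain exponential moving average in the spirit of TensorBoard's smoothing slider, not its exact implementation.

```python
from statistics import mean


def average_over_envs(rewards_per_env):
    """Average reward curves element-wise across environments.

    rewards_per_env: dict mapping a (hypothetical) env name to a list of
    rewards logged at the same training steps.
    """
    curves = list(rewards_per_env.values())
    # zip() stops at the shortest curve, so every step has one value per env.
    return [mean(step_values) for step_values in zip(*curves)]


def exponential_smoothing(values, weight=0.6):
    """Simple exponential moving average over a single curve.

    weight close to 1 smooths heavily; weight 0 returns the raw values.
    """
    smoothed = []
    last = values[0] if values else None
    for v in values:
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed


# Placeholder data for two hypothetical environments.
rewards_per_env = {
    "env_a": [1.0, 2.0, 3.0],
    "env_b": [3.0, 4.0, 5.0],
}

mean_curve = average_over_envs(rewards_per_env)
smoothed_curve = exponential_smoothing(mean_curve, weight=0.5)
```

Note that the two steps do different things: the first averages across environments at each step, the second smooths one curve over time.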