How to plot a figure of the episode reward?

MG2033 / A2C

A Clearer and Simpler Synchronous Advantage Actor Critic (A2C) Implementation in TensorFlow

Apache License 2.0

Closed gd-zhang closed 6 years ago

gd-zhang commented 6 years ago

In the summary, I can only find the reward for each environment. Am I supposed to average over all envs?

MG2033 commented 6 years ago

Yes, if you average over all environments, you'll get the average training performance.
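
A minimal sketch of that averaging, assuming the per-environment episode rewards have already been collected into one list per environment. The names `rewards_per_env` and `mean_over_envs` are hypothetical and not taken from this repository, and the numbers are made-up illustration data.

```python
from statistics import mean


def mean_over_envs(rewards_per_env):
    """Average episode rewards across environments, episode by episode.

    rewards_per_env[i][k] is the reward of the k-th finished episode in
    environment i. Environments may have finished a different number of
    episodes, so the result is truncated to the shortest list.
    """
    n_episodes = min(len(env) for env in rewards_per_env)
    return [mean(env[k] for env in rewards_per_env) for k in range(n_episodes)]


# Hypothetical data: 3 environments, one reward per finished episode.
rewards_per_env = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [2.0, 3.0],
]
averaged = mean_over_envs(rewards_per_env)  # one value per episode index
```

The resulting list can then be plotted against the episode index with any plotting tool. Truncating to the shortest list is only one possible choice; aligning by training step instead of episode index would be another.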

However, TensorBoard does this automatically through its smoothing factor.
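
For reference, the smoothing slider is, to my understanding, an exponential moving average along the step axis of each plotted curve. The sketch below shows that idea in plain Python. It is an approximation for illustration, not TensorBoard's actual code, and the name `ema_smooth` and the default `weight` are assumptions.

```python
def ema_smooth(values, weight=0.6):
    """Exponential moving average over a sequence of scalars.

    weight is the smoothing factor in [0, 1): 0 leaves the curve unchanged,
    values close to 1 smooth heavily.
    """
    if not values:
        return []
    smoothed = []
    last = values[0]  # start from the first point so the curve is not biased toward 0
    for v in values:
        last = weight * last + (1.0 - weight) * v
        smoothed.append(last)
    return smoothed


# Hypothetical noisy reward curve.
noisy = [0.0, 2.0, 0.0, 2.0]
smooth = ema_smooth(noisy, weight=0.5)
```

Note that this smooths one curve over training steps. If each environment is logged as a separate curve, an explicit average across environments may still be needed to get a single training curve.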