Closed brantbzhang closed 5 years ago
This code only supports running batch normalization using the statistics of the current batch (is_training = True), as this is what is effective at training time. You're correct that if you add the UPDATE_OPS dependencies to the graph you'll be able to run batch normalization using a moving average of those statistics (is_training = False), which will usually slightly improve test time performance.
int ppo2.py have tf.get_collection(tf.GraphKeys.UPDATE_OPS) but not use like:
method step and value in CnnPolicy, why not set batch_norm(is_training=False)