[Open] Myrkiriad-coder opened this issue 3 years ago
Sorry I'm so late to reply. Thank you for the advice. Actually, if you use static parameters you can just set the entropy coefficient to 0, since the entropy term is constant anyway.
I really think using a neural network to compute the std is much better than using static parameters. I forgot to do that in this repository.
Describe the bug
In ppo_continous_tensorflow.py, entropy is calculated with:

dist_entropy = tf.math.reduce_mean(self.distributions.entropy(action_mean, self.std))

Since the entropy of a Gaussian depends only on its std, and self.std is a static parameter, dist_entropy always has the same value. The entropy loss therefore contributes no gradient and has no effect on learning.

To Reproduce
Launch any env and pause your debugger on dist_entropy. Check that it has the same value for every batch at any point during learning.
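To see why this happens, here is a minimal numpy sketch (independent of the repository's code) of the diagonal-Gaussian entropy formula: the action mean never appears in it, so with a fixed std the result is the same constant for every state and every batch.

```python
import numpy as np

def gaussian_entropy(std):
    # Entropy of a diagonal Gaussian: sum over dims of 0.5 * log(2*pi*e*std^2).
    # Note that the action mean does not appear anywhere in the formula.
    return float(np.sum(0.5 * np.log(2.0 * np.pi * np.e * np.asarray(std) ** 2)))

# With a static std, entropy is identical no matter what mean the network predicts:
static_std = [0.5, 0.5]
for action_mean in ([0.0, 0.0], [3.0, -7.0], [100.0, 42.0]):
    print(gaussian_entropy(static_std))  # same value on every iteration
```

Because the value is constant, its gradient with respect to the network weights is zero, which is exactly why the entropy bonus does nothing here.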
Expected behavior
The std should not be static; it should somehow reflect the network's actual prediction confidence.
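One common way to achieve this (a sketch only, with hypothetical weights and layer sizes, not the repository's actual architecture) is to give the policy a second head that outputs log_std per state. The std, and therefore the entropy, then varies with the input, so the entropy bonus produces a real gradient signal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-headed policy: a shared hidden layer feeding separate
# mean and log_std heads, so std (and hence entropy) depends on the state.
W_h = rng.normal(size=(4, 16)) * 0.5
W_mu = rng.normal(size=(16, 2)) * 0.5
W_ls = rng.normal(size=(16, 2)) * 0.5

def policy(state):
    h = np.tanh(state @ W_h)
    mean = h @ W_mu
    log_std = np.clip(h @ W_ls, -5.0, 2.0)  # clip for numerical safety
    return mean, np.exp(log_std)

def entropy(std):
    # Diagonal-Gaussian entropy, same formula as before.
    return float(np.sum(0.5 * np.log(2.0 * np.pi * np.e * std ** 2)))

s1, s2 = rng.normal(size=4), rng.normal(size=4)
_, std1 = policy(s1)
_, std2 = policy(s2)
# Different states now yield different entropies.
print(entropy(std1), entropy(std2))
```

In a TensorFlow version of this, log_std would simply be a second Dense output of the actor network, and self.std in the entropy call would be replaced by the per-state prediction.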