openai / baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Question about gaussian distribution internals. #1232

Open bolshoytoster opened 5 months ago

bolshoytoster commented 5 months ago

I'm trying to port ppo2 to Rust, and I've mostly managed to do so. I have, however, come across something I don't understand.

In baselines/common/distributions.py, when creating the DiagGaussianDistribution, the tensor mean is multiplied by 0, then added to logstd: https://github.com/openai/baselines/blob/ea25b9e8b234e6ee1bca43083f8f3cf974143998/baselines/common/distributions.py#L105

Can anyone explain why it isn't simply the following?

pdparam = tf.concat([mean, logstd], axis=1)
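
For anyone porting this, one plausible reading is that the multiply-by-zero is a broadcasting trick. The sketch below assumes that mean has shape [batch, n] while logstd is a single row of shape [1, n]. It uses NumPy as a stand-in for TensorFlow, and the sizes are hypothetical. Under that assumption the plain concat fails because the batch dimensions differ, whereas mean * 0.0 + logstd has the same shape as mean and holds the logstd values in every row.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch, n = 4, 3

# Assumed shapes: one mean row per batch element, one shared logstd row.
mean = np.random.randn(batch, n).astype(np.float32)
logstd = np.zeros((1, n), dtype=np.float32)

# Plain concatenation along axis 1 requires matching batch dimensions,
# so (4, 3) with (1, 3) is rejected.
try:
    np.concatenate([mean, logstd], axis=1)
    naive_ok = True
except ValueError:
    naive_ok = False

# mean * 0.0 is a zero tensor with the shape of mean; adding logstd
# broadcasts the single row across the batch dimension.
broadcast_logstd = mean * 0.0 + logstd
pdparam = np.concatenate([mean, broadcast_logstd], axis=1)

print(naive_ok)       # whether the plain concat succeeded
print(pdparam.shape)  # shape of the combined parameter tensor
```

In a port, an explicit broadcast or tile of logstd to the batch size should give the same values without the extra multiply, provided the shapes really are as assumed here.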