Khrylx / PyTorch-RL

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
MIT License

question about weight init #14

Closed (gunshi closed this issue 5 years ago)

gunshi commented 5 years ago

Hi, your implementation is great and easy to read. I just have one question about this line: https://github.com/Khrylx/PyTorch-RL/blob/15b574f5d52f5eeab6917c90c17e8739578f3d96/models/mlp_policy.py#L24 Is there any particular reason why the weights are initialized that way, with that specific scale, instead of the usual Gaussian or Xavier scheme? Thanks! Gunshi

Khrylx commented 5 years ago

Hi,

This is just a choice inherited from other reference implementations. I find that it works well in practice, although a better initialization could yield better results for some problems.

Ye
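For readers comparing the options discussed above, here is a minimal sketch, not the repository's actual code. It assumes the linked line shrinks the default-initialized weights of the policy's final layer by a small constant (0.1 is an assumed value) and zeroes the bias. The helper `make_head`, the layer sizes, and the scheme names are hypothetical and exist only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def make_head(in_dim, out_dim, scheme="scaled_default", scale=0.1):
    """Build a final linear layer under one of three init schemes (hypothetical helper)."""
    layer = nn.Linear(in_dim, out_dim)  # starts from PyTorch's default init
    if scheme == "scaled_default":
        # Assumption: shrink the default weights by a constant and zero the bias,
        # so the initial outputs stay close to zero.
        with torch.no_grad():
            layer.weight.mul_(scale)
            layer.bias.zero_()
    elif scheme == "xavier":
        nn.init.xavier_uniform_(layer.weight)
        nn.init.zeros_(layer.bias)
    elif scheme == "normal":
        nn.init.normal_(layer.weight, mean=0.0, std=1.0)
        nn.init.zeros_(layer.bias)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return layer


# Stand-in for the activations of the last hidden layer (tanh keeps them in [-1, 1]).
features = torch.tanh(torch.randn(256, 64))

for scheme in ("scaled_default", "xavier", "normal"):
    head = make_head(64, 6, scheme)
    with torch.no_grad():
        out_std = head(features).std().item()
    print(f"{scheme:>15}: output std = {out_std:.3f}")
```

Under these assumptions, the scaled scheme should give the smallest initial output spread of the three. One plausible reason it works well in practice is that the initial policy outputs start near zero, though the thread itself does not state a reason.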

gunshi commented 5 years ago

Thanks for the answer!