nikhilbarhate99 / PPO-PyTorch

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
MIT License

Forgot to copy policy to policy_old during PPO initialization? #10

Closed YilunZhou closed 5 years ago

YilunZhou commented 5 years ago

Should there be a `self.policy_old.load_state_dict(self.policy.state_dict())` on line 85 of PPO.py, after the initialization of the PPO object? PyTorch's random initialization does not guarantee that the two policies will be the same. The same issue applies to PPO_continuous.py.
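
A minimal sketch of the suggested fix, assuming the PPO class holds two `ActorCritic` networks named `policy` and `policy_old`; the `ActorCritic` architecture and constructor arguments shown here are simplified placeholders, not the repo's exact code:

```python
import torch
import torch.nn as nn


class ActorCritic(nn.Module):
    """Placeholder actor-critic network; the real definition lives in PPO.py."""

    def __init__(self, state_dim, action_dim, n_latent_var):
        super().__init__()
        self.action_layer = nn.Sequential(
            nn.Linear(state_dim, n_latent_var),
            nn.Tanh(),
            nn.Linear(n_latent_var, action_dim),
            nn.Softmax(dim=-1),
        )
        self.value_layer = nn.Sequential(
            nn.Linear(state_dim, n_latent_var),
            nn.Tanh(),
            nn.Linear(n_latent_var, 1),
        )


class PPO:
    def __init__(self, state_dim, action_dim, n_latent_var, lr, betas):
        self.policy = ActorCritic(state_dim, action_dim, n_latent_var)
        self.optimizer = torch.optim.Adam(self.policy.parameters(), lr=lr, betas=betas)
        self.policy_old = ActorCritic(state_dim, action_dim, n_latent_var)
        # Suggested fix: copy the freshly initialized weights of `policy` into
        # `policy_old` so both networks start from identical parameters.
        # Without this line the two networks are independently randomly
        # initialized, and the first batch of importance-sampling ratios
        # pi_theta / pi_theta_old would be computed against the wrong policy.
        self.policy_old.load_state_dict(self.policy.state_dict())
```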

nikhilbarhate99 commented 5 years ago

Thanks!