My initial implementation of VPG took around 300 seconds to execute 200,000 environment interactions. That is about 2x slower than the OpenAI Spinning Up implementation.
I ran the following script with the cProfile module.
import gym
from rl_replicas.vpg.vpg import VPG
from rl_replicas.common.policies import ActorCriticPolicy
env = gym.make('CartPole-v0')
model = VPG(ActorCriticPolicy, env, seed=1)
model.learn()
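Since the run was profiled with cProfile, here is a minimal sketch of how such a profiling run can be done programmatically with cProfile and pstats. Note that `slow_step` is a hypothetical stand-in for `model.learn()`, which needs a gym environment to run.

```python
import cProfile
import io
import pstats

def slow_step():
    # Stand-in for model.learn(); any expensive call can be profiled this way.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
slow_step()
profiler.disable()

# Sort by cumulative time to surface the most expensive call paths.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats(10)  # show the top 10 entries
print(buffer.getvalue())
```

Equivalently, a whole script can be profiled from the shell with `python -m cProfile -s cumtime script.py`.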
07102020_performance_spiningup_vpg.txt
07102020_performance_rl_replicas_vpg.txt