My initial implementation of VPG took around 300 seconds to execute 200,000 environment interactions. That is about 2x slower than the OpenAI Spinning Up implementation.
I ran the following script with the cProfile module.
import gym
from rl_replicas.vpg.vpg import VPG
from rl_replicas.common.policies import ActorCriticPolicy
env = gym.make('CartPole-v0')
model = VPG(ActorCriticPolicy, env, seed=1)
model.learn()
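Since the run was profiled with cProfile, here is a minimal sketch of how such a profiling run can be done programmatically with cProfile and pstats. Note that `slow_step` is a hypothetical stand-in for `model.learn()`, which needs a gym environment to run.

```python
import cProfile
import io
import pstats

def slow_step():
    # Stand-in for model.learn(); any expensive call can be profiled this way.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
slow_step()
profiler.disable()

# Sort by cumulative time to surface the most expensive call paths.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats(10)  # show the top 10 entries
print(buffer.getvalue())
```

Equivalently, a whole script can be profiled from the shell with `python -m cProfile -s cumtime script.py`.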
07102020_performance_spiningup_vpg.txt
07102020_performance_rl_replicas_vpg.txt