hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License

[question] ppo2 and prioritized experience replay #131

Closed AloshkaD closed 5 years ago

AloshkaD commented 5 years ago

I need to test ppo2 with prioritized experience replay, and I wonder if anyone has written a similar integration before I go ahead and write it from scratch.

araffin commented 5 years ago

Hello, PPO is meant to be on-policy (the policy that generates the samples must be the same one being optimized, not an older version), so I don't think it really makes sense to use an experience replay in that case.
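
For reference, a minimal sketch (assuming the standard stable-baselines 2.x API and a gym environment such as `CartPole-v1`, neither of which is mentioned in this thread) of how PPO2 is trained: it only exposes on-policy rollout settings such as `n_steps`, and has no replay-buffer arguments, because each update consumes freshly collected samples.

```python
import gym
from stable_baselines import PPO2

env = gym.make('CartPole-v1')
# n_steps controls the length of each on-policy rollout; there is no replay buffer option
model = PPO2('MlpPolicy', env, n_steps=128, verbose=1)
model.learn(total_timesteps=10000)  # collects fresh rollouts, updates, then discards them
```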

AloshkaD commented 5 years ago

My bad, I meant to say dueling DQN. I was working on ppo2 and it was stuck in my head.

araffin commented 5 years ago

Well then, I don't really understand your question either. Prioritized experience replay is already implemented for dueling DQN in stable-baselines.
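
As a pointer, here is a minimal sketch (assuming the stable-baselines 2.x DQN API and `CartPole-v1`, which are not spelled out in this thread) of enabling both the dueling architecture and prioritized replay:

```python
import gym
from stable_baselines import DQN

env = gym.make('CartPole-v1')
model = DQN(
    'MlpPolicy', env,
    policy_kwargs=dict(dueling=True),  # dueling architecture (already the default for DQN policies)
    prioritized_replay=True,           # switch from uniform to prioritized experience replay
    prioritized_replay_alpha=0.6,      # how strongly sampling is biased toward high-TD-error transitions
    verbose=1,
)
model.learn(total_timesteps=10000)
```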

AloshkaD commented 5 years ago

I see, I missed that in the code. I'll give it a second look. Thanks!