How to convert timestep based learning to episodic learning

hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

MIT License


Hello,

I have the same problem, but I found these links. Maybe they could help you: https://github.com/hill-a/stable-baselines/issues/352 https://github.com/hill-a/stable-baselines/issues/776

I have not tried the proposed solution yet, because I still need to figure out how to combine it with seeding my env. What I want is that every time a new episode "i" starts, the algorithm passes "i" to my env's seed (seed(i)), so that each episode samples new but reproducible values.
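To make the seeding idea concrete, here is a minimal sketch of what I have in mind. It is not taken from the linked issues and it does not use stable-baselines itself: `ToyEnv`, `SeedPerEpisode` and `run_episodes` are hypothetical names, and `ToyEnv` only imitates the old Gym interface (`seed`, `reset`, `step` returning a 4-tuple). The wrapper counts episodes and calls `seed(i)` just before each reset.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a Gym-style env (old API: seed/reset/step)."""

    def __init__(self, episode_length=3):
        self.episode_length = episode_length
        self.rng = random.Random()
        self.t = 0

    def seed(self, seed=None):
        # Re-create the env's private RNG from the given seed.
        self.rng = random.Random(seed)
        return [seed]

    def reset(self):
        self.t = 0
        return self.rng.random()

    def step(self, action):
        self.t += 1
        obs = self.rng.random()
        done = self.t >= self.episode_length
        return obs, 0.0, done, {}


class SeedPerEpisode:
    """Hypothetical wrapper: seeds the env with the episode index on each reset."""

    def __init__(self, env):
        self.env = env
        self.episode = 0  # index "i" of the next episode to start

    def reset(self):
        self.env.seed(self.episode)  # seed(i) for episode i
        obs = self.env.reset()
        self.episode += 1
        return obs

    def step(self, action):
        return self.env.step(action)


def run_episodes(env, n_episodes):
    """Roll out n_episodes with a dummy action; return each episode's first observation."""
    first_obs = []
    for _ in range(n_episodes):
        first_obs.append(env.reset())
        done = False
        while not done:
            _, _, done, _ = env.step(0)
    return first_obs
```

Two fresh wrapped envs give the same sequence of episodes, while episodes within one run differ from each other. With a real Gym env the same logic would presumably go into a `gym.Wrapper` subclass; because the seeding hooks into `reset()`, it should not matter whether my own loop or the training code triggers the reset, but I have not verified this with stable-baselines.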

Anyway, good luck.

How to convert timestep based learning to episodic learning #1175