Controlling randomness of environments

krezelj / academia

Repository for a Bachelor's Thesis at Warsaw University of Technology that touches on the topic of Curriculum Learning.

MIT License

Good idea. I played with environment seeding a while back while implementing random states in PPO. One thing I noticed back then was that even with a seeded reset and a seeded agent, only the first episode was fully reproducible. For some reason, subsequent episodes diverged more and more over time. I remember reading something about wind power (I should mention I was using LunarLander) not using the seed passed to reset, but I can't find it now, so I'm most likely misremembering something.

Anyway, we should still investigate what the issue was.
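A minimal sketch of how such an investigation could start: run the same seeded rollout twice and report the first episode and step at which the two runs differ. Everything here is an assumption for illustration. `ToyEnv`, `rollout` and `first_divergence` are hypothetical names, the toy environment only imitates a Gym-style `reset(seed=...)` and is not LunarLander, and its unseeded `wind_rng` is just one possible way a seed could fail to reach every source of randomness, not a confirmed cause.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a Gym-style environment (not a real library class)."""

    def __init__(self, leaky=False):
        self.leaky = leaky
        self.rng = random.Random()
        # Secondary noise source (think "wind"). It starts OS-seeded and,
        # when leaky=True, reset() never reseeds it.
        self.wind_rng = random.Random()
        self.pos = 0.0
        self.t = 0

    def reset(self, seed=None):
        if seed is not None:
            self.rng.seed(seed)
            if not self.leaky:
                self.wind_rng.seed(seed + 1)
        self.pos = self.rng.uniform(-1.0, 1.0)
        self.t = 0
        return self.pos

    def step(self, action):
        self.pos += action + self.wind_rng.gauss(0.0, 0.1)
        self.t += 1
        return self.pos, self.t >= 20  # (observation, done)


def rollout(env, seed, episodes):
    """Run a seeded random agent; the env is seeded only on the first reset."""
    agent_rng = random.Random(seed)
    trajectories = []
    for ep in range(episodes):
        obs = env.reset(seed=seed if ep == 0 else None)
        traj, done = [obs], False
        while not done:
            obs, done = env.step(agent_rng.choice([-0.1, 0.0, 0.1]))
            traj.append(obs)
        trajectories.append(traj)
    return trajectories


def first_divergence(run_a, run_b):
    """Return (episode, step) of the first differing observation, or None."""
    for ep, (traj_a, traj_b) in enumerate(zip(run_a, run_b)):
        for t, (a, b) in enumerate(zip(traj_a, traj_b)):
            if a != b:
                return ep, t
    return None


if __name__ == "__main__":
    for leaky in (False, True):
        a = rollout(ToyEnv(leaky), seed=0, episodes=3)
        b = rollout(ToyEnv(leaky), seed=0, episodes=3)
        print(f"leaky={leaky}: first divergence at {first_divergence(a, b)}")
```

Running the same comparison against the real environment should at least tell us whether the divergence starts at an episode boundary or in the middle of an episode, which narrows down where to look.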

It's probably not going to be an issue with MiniGrid, but I'm a bit concerned about Atari games since they use an emulated console. Hopefully the gym environment seed is passed down to the emulator somehow.

Controlling randomness of environments #96