krezelj / academia

Repository for a Bachelor's Thesis at Warsaw University of Technology that touches on the topic of Curriculum Learning.
MIT License

Controlling randomness of environments #96

Closed by maciejors 11 months ago

maciejors commented 11 months ago

We already have a random_state parameter for agents, but to make experiments fully reproducible we need to add a similar parameter to environments.

Gymnasium environments accept a seed parameter in their reset() method. We need to figure out how to pass it in a way that doesn't make the environment start from an identical state every time it is reset.
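One possible approach, sketched here only as an illustration, is to derive a fresh reset seed for every episode from a single master random_state. The run stays reproducible as a whole, but consecutive resets get different seeds. The names episode_seeds and seeded_reset are hypothetical helpers, not part of this repository or of Gymnasium; the only Gymnasium call assumed is reset(seed=...).

```python
import random
from typing import Any, Iterator, Optional


def episode_seeds(random_state: Optional[int]) -> Iterator[int]:
    """Yield an endless stream of per-episode seeds derived from one master seed.

    Hypothetical helper: the same random_state always produces the same
    sequence of seeds, but the seeds inside one sequence differ.
    """
    rng = random.Random(random_state)
    while True:
        # Non-negative integers below 2**31 are safe to use as reset seeds.
        yield rng.randrange(2**31)


def seeded_reset(env: Any, seeds: Iterator[int]) -> Any:
    """Reset `env` with the next seed from the stream.

    Assumes `env` follows the Gymnasium API, i.e. exposes reset(seed=...).
    """
    return env.reset(seed=next(seeds))
```

With a Gymnasium environment this would be used as seeds = episode_seeds(random_state) once per experiment, followed by seeded_reset(env, seeds) at the start of each episode. A simpler alternative, as far as I understand the Gymnasium documentation, is to pass the seed only to the first reset() call and call reset() without a seed afterwards, which keeps the environment's random number generator running instead of restarting it.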

krezelj commented 11 months ago

Good idea. I played with environment seeding a while back while implementing random states in PPO. One thing I noticed back then was that even with a seeded reset and a seeded agent, only the first episode was fully reproducible. For some reason, subsequent episodes diverged more and more over time. I remember reading something about wind power not using the seed passed to reset (I should mention I was using LunarLander), but now I can't find it, so I'm most likely misremembering something.

Anyway we should still investigate what the issue was.
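As a starting point for that investigation, something like the sketch below could pinpoint where two supposedly identical runs stop matching. The helper first_divergence is hypothetical; it assumes each run has been logged as a flat list of plain comparable records, for example (episode, step, reward) tuples of Python numbers rather than NumPy arrays.

```python
from typing import Any, Optional, Sequence


def first_divergence(run_a: Sequence[Any], run_b: Sequence[Any]) -> Optional[int]:
    """Return the index of the first record that differs between two runs.

    Returns None when both runs are identical. If one run is a strict
    prefix of the other, the index where the shorter run ends is returned.
    """
    for index, (record_a, record_b) in enumerate(zip(run_a, run_b)):
        if record_a != record_b:
            return index
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))
    return None
```

Running the same seeded agent and environment twice and comparing the logs this way would show whether the divergence starts exactly at the second episode's reset or somewhere inside an episode.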

It's probably not going to be an issue with minigrid, but I'm a bit concerned about Atari games since they use an emulated console. Hopefully the gym environment seed is passed down to the emulator somehow.