Farama-Foundation / Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
https://gymnasium.farama.org
MIT License

[Proposal] Initial State Tracking - Bipedal Walker Hardcore #892

Open arthur-plautz opened 10 months ago

arthur-plautz commented 10 months ago

Proposal

Create methods to retrieve and set the initial states that uniquely represent the terrain generated for Bipedal Walker Hardcore.

Motivation

I'm currently doing research on curriculum learning, based on a paper that used the Bipedal Walker Hardcore problem to demonstrate that selecting the initial conditions of the experiment could improve overall performance. My work requires extracting the states used to create the terrain, so that I can automatically select these initial states during the experiment and also set them to generate a specific terrain. I was making these changes to the bipedal environment locally, but it occurred to me that they could benefit other people who have the same issue.

Pitch

The idea is to add a parameter to the reset method that overrides the states used to generate the terrain, and to persist those states in an attribute, similar to the existing terrain attribute.

Alternatives

We may have other ideas of how to do it, but I think the solution would involve a method to set the states and an attribute to access them. Feel free to make your suggestions 😄

Additional context

The paper: Automated curriculum learning for embodied agents a neuroevolutionary approach.
Current fork: https://github.com/arthur-plautz/gym


pseudo-rnd-thoughts commented 10 months ago

This should be possible with the options argument to pass an intended state for the environment to use

arthur-plautz commented 10 months ago

Yes, but the options argument is not used anywhere in the reset() method: https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/box2d/bipedal_walker.py

Is it possible that it was supposed to be handled in the super().reset() call? Sorry if I missed something here.

pseudo-rnd-thoughts commented 10 months ago

What I mean is that this solution could be implemented using the options argument, so that the initial state is passed through options, e.g., options["state"] = new_state.
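As a rough illustration of that suggestion, here is a minimal sketch using a toy stand-in class rather than the real BipedalWalker environment. It only mirrors the Gymnasium-style reset(seed=..., options=...) signature. The class name ToyTerrainEnv, the terrain_state attribute, and the list-of-floats terrain representation are assumptions made for this example; only the options["state"] key comes from the discussion above.

```python
import random
from typing import Any, Optional


class ToyTerrainEnv:
    """Hypothetical stand-in for an environment with randomly generated terrain.

    This is NOT the real BipedalWalker; it only imitates the
    reset(seed=..., options=...) signature to show the idea.
    """

    TERRAIN_LENGTH = 10  # arbitrary number of terrain segments for this sketch

    def __init__(self) -> None:
        self._rng = random.Random()
        # Hypothetical attribute exposing the state that generated the terrain.
        self.terrain_state = []

    def reset(self, *, seed: Optional[int] = None, options: Optional[dict] = None):
        if seed is not None:
            self._rng = random.Random(seed)

        if options is not None and "state" in options:
            # The caller supplied an initial state: use it instead of sampling.
            state = list(options["state"])
            if len(state) != self.TERRAIN_LENGTH:
                raise ValueError("state must have TERRAIN_LENGTH entries")
            self.terrain_state = state
        else:
            # Default behaviour: sample a fresh terrain.
            self.terrain_state = [
                self._rng.uniform(-1.0, 1.0) for _ in range(self.TERRAIN_LENGTH)
            ]

        observation: Any = self.terrain_state[0]  # placeholder observation
        info = {"terrain_state": list(self.terrain_state)}
        return observation, info


env = ToyTerrainEnv()

# First episode: terrain is sampled, and its state is saved for later.
_, info = env.reset(seed=0)
saved_state = info["terrain_state"]

# Later episode: replay exactly the same terrain via options["state"].
_, replay_info = env.reset(options={"state": saved_state})
```

In the actual environment, the state would have to capture whatever the terrain generator samples, and the attribute name would be up to the PR. The sketch only shows the retrieve-then-replay flow a curriculum-learning loop would rely on.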

arthur-plautz commented 10 months ago

Oh, I see, I'll make the changes then open a PR, thank you for your help!