Farama-Foundation / Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
https://gymnasium.farama.org
MIT License

[Proposal] Deepcopy of an Environment Object #737

Closed sdpkjc closed 8 months ago

sdpkjc commented 1 year ago

Proposal

I am proposing the addition of true deepcopy or snapshot functionality to the utils.EzPickle class in the Gymnasium library.

Motivation

While working with the Gymnasium library, I noticed that most environment classes inherit from the utils.EzPickle class, and that this class has a potential issue. When an environment is passed to copy.deepcopy, EzPickle discards the current environment state and creates a new environment instead. This behaviour is problematic for use cases that require the copy to keep the environment state.
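A minimal sketch of the behaviour described above, runnable without Gymnasium installed. `EzPickleLike` is a hypothetical stand-in that follows the pattern as understood here (remember the constructor arguments, then rebuild from them when the object is copied or unpickled); it is not the library's actual code, and `CounterEnv` is a made-up toy environment.

```python
import copy


class EzPickleLike:
    """Hypothetical stand-in for the pattern described above (not Gymnasium code)."""

    def __init__(self, *args, **kwargs):
        # Remember only the constructor arguments.
        self._ezpickle_args = args
        self._ezpickle_kwargs = kwargs

    def __getstate__(self):
        # The "state" handed to pickle/deepcopy is just the constructor arguments.
        return {
            "_ezpickle_args": self._ezpickle_args,
            "_ezpickle_kwargs": self._ezpickle_kwargs,
        }

    def __setstate__(self, d):
        # Rebuild a fresh object from the constructor arguments and adopt its attributes.
        fresh = type(self)(*d["_ezpickle_args"], **d["_ezpickle_kwargs"])
        self.__dict__.update(fresh.__dict__)


class CounterEnv(EzPickleLike):
    """Toy environment whose only runtime state is `position`."""

    def __init__(self, start=0):
        EzPickleLike.__init__(self, start)
        self.position = start

    def step(self):
        self.position += 1
        return self.position


env = CounterEnv(start=5)
env.step()
env.step()

clone = copy.deepcopy(env)
print(env.position, clone.position)  # prints "7 5": the copy was reset to its initial state
```

The copy is a valid environment, but it starts from the constructor arguments rather than from the state the original had reached.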

Is there a specific reason why EzPickle was designed in this way? Is it due to limitations that prevent real deepcopy from being implemented? I am asking this particularly because I am willing to draft a PR with the implementation of true deepcopy or snapshot functionality. I believe this feature would greatly benefit the library.

Additionally, I noticed https://github.com/openai/gym/issues/402 which could be related.

Pitch

Providing a true deepcopy or snapshot functionality would improve the usability of the Gymnasium library. This change would allow users to easily store the state of a certain environment, facilitating tasks such as training reinforcement learning algorithms, which often require the ability to interact with multiple environment instances.

Alternatives

An alternative solution could be to create a new class that implements true deepcopy or snapshot. Another possibility would be to support clone_state and restore_state functions, similar to the current Atari environment, that would cater to this functionality. However, since EzPickle is already widely used in the library, and considering that this feature might be essential for several users, I believe modifying the EzPickle class would be more beneficial.
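As a rough illustration of the clone_state / restore_state alternative, the sketch below uses a hypothetical toy environment (`RandomWalkEnv` is not part of Gymnasium, and the method signatures are assumptions modelled on the names mentioned above). The point it tries to show is that a snapshot needs to capture everything that affects future transitions, including the random number generator.

```python
import random


class RandomWalkEnv:
    """Hypothetical toy environment used only for illustration."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.position = 0

    def step(self):
        # Move one step left or right at random.
        self.position += self.rng.choice([-1, 1])
        return self.position

    def clone_state(self):
        # Capture the position and the RNG state so future steps can be replayed.
        return {"position": self.position, "rng_state": self.rng.getstate()}

    def restore_state(self, state):
        self.position = state["position"]
        self.rng.setstate(state["rng_state"])


env = RandomWalkEnv(seed=42)
for _ in range(3):
    env.step()

snapshot = env.clone_state()
first_rollout = [env.step() for _ in range(5)]

env.restore_state(snapshot)
second_rollout = [env.step() for _ in range(5)]

# Both rollouts start from the same snapshot, so they visit the same positions.
print(first_rollout == second_rollout)  # prints "True"
```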

sdpkjc commented 1 year ago

related to #94

pseudo-rnd-thoughts commented 1 year ago

This is certainly an interesting idea; however, I'm not sure how much Gymnasium can do as an API.

It might be possible to implement a clone_state or restore_state that defaults to NotImplemented. We could add the functionality to Gymnasium and to the check_env function for testing environments.
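One way this could look, as a sketch only: a base class whose clone_state and restore_state raise NotImplementedError by default (an assumed reading of "defaults to NotImplemented"), plus a round-trip check in the spirit of an environment checker. `StatefulEnvBase`, `GridEnv`, `OpaqueEnv`, and `check_state_roundtrip` are all hypothetical names; none of them exist in Gymnasium, and this is not how check_env is implemented.

```python
class StatefulEnvBase:
    """Hypothetical base class; state cloning is unsupported unless overridden."""

    def clone_state(self):
        raise NotImplementedError(f"{type(self).__name__} does not support clone_state")

    def restore_state(self, state):
        raise NotImplementedError(f"{type(self).__name__} does not support restore_state")


class GridEnv(StatefulEnvBase):
    """Toy environment that opts in to state cloning."""

    def __init__(self):
        self.position = 0

    def step(self, action):
        self.position += action
        return self.position

    def clone_state(self):
        return self.position

    def restore_state(self, state):
        self.position = state


class OpaqueEnv(StatefulEnvBase):
    """Toy environment that does not opt in."""

    def step(self, action):
        return 0


def check_state_roundtrip(env, actions):
    """Return True/False for pass/fail, or None if the env does not support cloning."""
    try:
        snapshot = env.clone_state()
    except NotImplementedError:
        return None
    expected = [env.step(a) for a in actions]
    env.restore_state(snapshot)
    replayed = [env.step(a) for a in actions]
    return expected == replayed


print(check_state_roundtrip(GridEnv(), [1, -1, 1]))  # prints "True"
print(check_state_roundtrip(OpaqueEnv(), [1]))       # prints "None"
```

Returning None for unsupported environments lets a checker skip the test rather than fail it, which keeps the feature optional.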

However, I'm worried that it is too late in Gymnasium's development to do this, and additionally the inconsistencies within environment implementations make this difficult. If starting again, I would design Env to have a state attribute containing all the environment information, probably as a dataclass, making such a function straightforward to implement. However, I don't think a change like this is possible anymore.
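To illustrate why a single state attribute would make this straightforward, here is a minimal sketch of that hypothetical design. `CartState` and `CartEnv` are invented for the example and do not reflect how any existing Gymnasium environment stores its state.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class CartState:
    """All mutable environment information lives in this one object."""

    position: float = 0.0
    velocity: float = 0.0
    history: list = field(default_factory=list)


class CartEnv:
    """Hypothetical environment that keeps its entire state in `self.state`."""

    def __init__(self):
        self.state = CartState()

    def step(self, force):
        s = self.state
        s.velocity += force
        s.position += s.velocity
        s.history.append(force)
        return s.position

    def clone_state(self):
        # A deep copy is enough because nothing else in the env is mutable.
        return copy.deepcopy(self.state)

    def restore_state(self, state):
        # Copy again so later steps do not mutate the caller's snapshot.
        self.state = copy.deepcopy(state)


env = CartEnv()
env.step(1.0)
snapshot = env.clone_state()

env.step(2.0)                # position is now 4.0
env.restore_state(snapshot)  # back to position 1.0
print(env.state.position)    # prints "1.0"
```

With this layout the generic clone and restore logic never needs to know what a particular environment stores, which is the property existing environments lack.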

dm-ackerman commented 8 months ago

> It might be possible to implement a clone_state or restore_state that defaults to NotImplemented. We could add the functionality to gymnasium and to the check_env function for testing environments.
>
> However, I'm worried that this is too late in its development to do this and additionally the inconsistencies within environment implementations make this difficult.

Is the fact that it is late in the development a reason not to do this as an optional feature? (related: #842). Maybe not all envs could/would support this feature, but is there value in standardizing how it would be done for envs that do support it?
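A sketch of what an optional, standardized interface might look like, assuming the clone_state / restore_state names from earlier in the thread. `SupportsStateCloning`, `ChainEnv`, `BlackBoxEnv`, and `try_snapshot` are hypothetical; the idea is only that a structural protocol lets callers detect support without requiring every environment to change.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SupportsStateCloning(Protocol):
    """Hypothetical opt-in interface for environments that can be snapshotted."""

    def clone_state(self) -> Any: ...

    def restore_state(self, state: Any) -> None: ...


class ChainEnv:
    """Toy environment that implements the optional interface."""

    def __init__(self):
        self.position = 0

    def clone_state(self):
        return self.position

    def restore_state(self, state):
        self.position = state


class BlackBoxEnv:
    """Toy environment that does not implement it."""


def try_snapshot(env):
    """Return a snapshot if the env opts in, otherwise None."""
    if isinstance(env, SupportsStateCloning):
        return env.clone_state()
    return None


print(isinstance(ChainEnv(), SupportsStateCloning))     # prints "True"
print(isinstance(BlackBoxEnv(), SupportsStateCloning))  # prints "False"
```

Note that a runtime-checkable protocol only verifies that the methods exist, not that they restore state correctly, so a behavioural test would still be needed.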