Farama-Foundation / Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
https://gymnasium.farama.org
MIT License

[Proposal] Wrapper rewrite #140

Closed pseudo-rnd-thoughts closed 1 year ago

pseudo-rnd-thoughts commented 1 year ago

Proposal

Gymnasium already contains a large collection of wrappers, but we believe that the wrappers can be improved to

  1. Support arbitrarily complex observation / action spaces. As RL has advanced, action and observation spaces have become more complex, and the current wrappers were not implemented with these spaces in mind.
  2. Support NumPy, JAX and PyTorch data. With hardware-accelerated environments, e.g. Brax, written in JAX, and similar PyTorch-based programs, NumPy is no longer the only option. Therefore, these upgrades will use Jumpy to call NumPy, JAX or PyTorch depending on the data.
  3. More wrappers. Projects like SuperSuit aimed to bring more wrappers to RL; however, these wrappers can be moved into Gymnasium.
  4. Versioning. As with environments, the implementation details of a wrapper can change agent performance. Therefore, we propose adding version numbers to all wrappers.
  5. Vectorised wrappers. In v28, we aim to rewrite VectorEnv so that it no longer inherits from Env; as a result, new vectorised versions of the wrappers will be provided.
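
To illustrate point 1, here is a minimal sketch of applying one function to every leaf of a nested observation. It uses plain Python containers rather than Gymnasium spaces or Jumpy, and `map_observation` is a hypothetical helper name, not part of the proposal.

```python
from typing import Any, Callable


def map_observation(func: Callable[[Any], Any], obs: Any) -> Any:
    """Apply `func` to every leaf of a nested dict / tuple observation.

    Hypothetical helper: dicts and tuples are treated as containers,
    anything else is treated as a leaf.
    """
    if isinstance(obs, dict):
        return {key: map_observation(func, value) for key, value in obs.items()}
    if isinstance(obs, tuple):
        return tuple(map_observation(func, value) for value in obs)
    return func(obs)


# A Dict-like observation that contains a Tuple-like and a nested Dict-like part.
nested_obs = {"position": (1.0, 2.0), "sensors": {"lidar": 4.0}}
doubled = map_observation(lambda leaf: leaf * 2, nested_obs)
```

A wrapper written against a helper of this kind would not need to special-case each container type.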

Motivation

No response

Pitch

Lambda Observation Wrappers - wrappers.lambda_observation

| Old name | New name | func tree struct | vector version |
| --- | --- | --- | --- |
| TransformObservation | LambdaObservation | - | VectorLambdaObservation |
| FilterObservation | FilterObservation | y | vectorise |
| FlattenObservation | FlattenObservation | x | vectorise |
| GrayScaleObservation | GrayscaleObservation | y | vectorise |
| PixelObservationWrapper | PixelObservation | x | vectorise |
| ResizeObservation | ResizeObservation | y | vectorise |
| - | ReshapeObservation | y | vectorise |
| - | RescaleObservation | y | vectorise |
| - | DTypeObservation | y | vectorise |
| NormalizeObservation | NormalizeObservation | x | VectorNormalizeObservation |
| TimeAwareObservation | TimeAwareObservation | - | VectorTimeAwareObservation |
| FrameStack | FrameStackObservation | - | VectorFrameStack |
| - | DelayObservation | - | VectorDelayObservation |
| AtariProcessing | AtariPreprocessing | - | - |
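
As a rough sketch of the idea behind LambdaObservation, the wrapper below applies a user-supplied function to every observation. It deliberately avoids Gymnasium's own base classes: `CounterEnv` and `LambdaObservationSketch` are hypothetical stand-ins that only mimic the `reset` / `step` return shapes.

```python
from typing import Any, Callable


class CounterEnv:
    """Hypothetical stand-in environment: the observation is the step count."""

    def reset(self, seed=None, options=None):
        self.count = 0
        return self.count, {}

    def step(self, action):
        self.count += 1
        terminated = self.count >= 3
        return self.count, 1.0, terminated, False, {}


class LambdaObservationSketch:
    """Apply `func` to every observation returned by the wrapped environment."""

    def __init__(self, env, func: Callable[[Any], Any]):
        self.env = env
        self.func = func

    def reset(self, seed=None, options=None):
        obs, info = self.env.reset(seed=seed, options=options)
        return self.func(obs), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # Only the observation is transformed; everything else passes through.
        return self.func(obs), reward, terminated, truncated, info


env = LambdaObservationSketch(CounterEnv(), lambda obs: obs * 10)
first_obs, _ = env.reset()
next_obs, reward, terminated, truncated, _ = env.step(action=0)
```

Under this reading, stateless wrappers in the table could be expressed as a fixed `func` passed to the lambda wrapper, though the issue does not spell out that design.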

Lambda Action Wrappers - wrappers.lambda_action

| Old name | New name | func tree structure | vector version |
| --- | --- | --- | --- |
| - | LambdaAction | - | VectorLambdaAction |
| ClipAction | ClipAction | y | vectorise |
| RescaleAction | RescaleAction | y | vectorise |
| - | NanAction | y | vectorise |
| - | StickyAction | - | VectorStickAction |
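
The proposal does not define StickyAction's behaviour. One plausible reading, sketched here as an assumption, is that the previous action is repeated with some probability. `EchoEnv`, `StickyActionSketch` and the `repeat_probability` parameter are all hypothetical names.

```python
import random


class EchoEnv:
    """Hypothetical stand-in: the observation echoes the action it received."""

    def reset(self, seed=None, options=None):
        return 0, {}

    def step(self, action):
        return action, 0.0, False, False, {}


class StickyActionSketch:
    """With probability `repeat_probability`, replay the previous action."""

    def __init__(self, env, repeat_probability: float, seed=None):
        if not 0.0 <= repeat_probability <= 1.0:
            raise ValueError("repeat_probability must be in [0, 1]")
        self.env = env
        self.repeat_probability = repeat_probability
        self.rng = random.Random(seed)
        self.last_action = None

    def reset(self, seed=None, options=None):
        self.last_action = None  # no action to repeat at episode start
        return self.env.reset(seed=seed, options=options)

    def step(self, action):
        repeat = self.rng.random() < self.repeat_probability
        if self.last_action is not None and repeat:
            action = self.last_action
        self.last_action = action
        return self.env.step(action)


always_sticky = StickyActionSketch(EchoEnv(), repeat_probability=1.0)
always_sticky.reset()
sticky_actions = [always_sticky.step(a)[0] for a in (1, 2, 3)]

never_sticky = StickyActionSketch(EchoEnv(), repeat_probability=0.0)
never_sticky.reset()
free_actions = [never_sticky.step(a)[0] for a in (1, 2, 3)]
```

Because the wrapper carries state between steps, it cannot be reduced to a pure function of the action, which may be why the table lists a dedicated vector version for it.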

Lambda Reward Wrappers - wrappers.lambda_reward

| Old name | New name | Vector version |
| --- | --- | --- |
| TransformReward | LambdaReward | VectorLambdaReward |
| ClipReward | ClipReward | vectorise |
| - | RescaleReward | vectorise |
| NormalizeReward | NormalizeReward | VectorNormalizeReward |

Common Wrappers - wrappers.common

| Old name | New name | Vector version |
| --- | --- | --- |
| AutoResetWrapper | AutoReset | - |
| PassiveEnvChecker | PassiveEnvChecker | - |
| OrderEnforcing | OrderEnforcing | vectorise |
| EnvCompatibility | remove for shimmy | - |
| RecordEpisodeStatistics | RecordEpisodeStatistics | VectorRecordEpisodeStatistics |
| RecordVideo | RecordVideo | VectorRecordVideo |
| RenderCollection | RenderCollection | VectorRenderCollection |
| HumanRendering | HumanRendering | - |
| - | JaxToNumpy | - vectorise |
| - | JaxToTorch | - vectorise |
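
For readers unfamiliar with the common wrappers, OrderEnforcing guards against calling `step` before `reset`. The sketch below shows that check in isolation. `ConstantEnv`, `OrderEnforcingSketch` and `ResetNeededError` are hypothetical stand-ins, not Gymnasium's classes.

```python
class ResetNeededError(RuntimeError):
    """Raised when step() is called before reset() (hypothetical name)."""


class ConstantEnv:
    """Hypothetical stand-in environment that always returns the same step."""

    def reset(self, seed=None, options=None):
        return 0, {}

    def step(self, action):
        return 0, 0.0, False, False, {}


class OrderEnforcingSketch:
    """Refuse to step the wrapped environment until reset() has been called."""

    def __init__(self, env):
        self.env = env
        self._has_reset = False

    def reset(self, seed=None, options=None):
        self._has_reset = True
        return self.env.reset(seed=seed, options=options)

    def step(self, action):
        if not self._has_reset:
            raise ResetNeededError("Cannot call step() before reset()")
        return self.env.step(action)


env = OrderEnforcingSketch(ConstantEnv())
try:
    env.step(0)
    stepped_before_reset = True
except ResetNeededError:
    stepped_before_reset = False

env.reset()
result_after_reset = env.step(0)
```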

Vector Only Wrappers - vector.wrappers.common

| Old name | New name |
| --- | --- |
| VectorListInfo | VectorListInfo |
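
As a sketch of what VectorListInfo does, the function below converts a dictionary-style vector info into one dictionary per sub-environment. It assumes the layout in which each key `k` is paired with a boolean mask `_k`, uses plain lists instead of arrays, and ignores nested dictionaries. `dict_info_to_list` is a hypothetical name.

```python
from typing import Any, Dict, List


def dict_info_to_list(infos: Dict[str, Any], num_envs: int) -> List[Dict[str, Any]]:
    """Split a vector info dict into a list with one info dict per environment."""
    list_info: List[Dict[str, Any]] = [{} for _ in range(num_envs)]
    for key, values in infos.items():
        if key.startswith("_"):
            continue  # mask entries are read together with their data key
        # Assumption: a missing mask means every environment reported the key.
        mask = infos.get(f"_{key}", [True] * num_envs)
        for index in range(num_envs):
            if mask[index]:
                list_info[index][key] = values[index]
    return list_info


vector_info = {"lives": [3, 0, 2], "_lives": [True, False, True]}
per_env_info = dict_info_to_list(vector_info, num_envs=3)
```

Here the second sub-environment did not report `lives`, so its entry is an empty dictionary.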

Alternatives

No response

Additional context

No response

Checklist

WillDudley commented 1 year ago

  1. Relevant: #181
  2. This proposal should also consider and be developed alongside PettingZoo 2.

WillDudley commented 1 year ago

  1. Wrappers should be versioned like environments.