[Proposal] Migrating the base `IsaacEnv`

isaac-sim / IsaacLab

Unified framework for robot learning built on NVIDIA Isaac Sim


Proposal

Gymnasium is the maintained fork of OpenAI Gym, now handled by the Farama Foundation. It is already the environment definition used by libraries such as rllib and tianshou, and others like stable-baselines3 are expected to update soon.

More recently, the torchrl framework has put forward an interesting RL environment definition that would fit our targeted applications well.

This thread is meant as a place to discuss and vote on which environment definition would be best to use.

Motivation

Migrating to Gymnasium 0.28

The current Gym definition (from 0.21.0) is outdated and depends on older libraries, such as the importlib-meta==4.1 package. This creates conflicts with newer packages, so it would be best to switch to the new Gym definition.

Related issues: OIGE #28, SB3 #1327

Once SB3 also upgrades to Gymnasium, it would be best to update everything to use the latest definitions.

More information on Gymnasium:

Migration guide: https://gymnasium.farama.org/content/migration-guide/
Compatibility guide: https://gymnasium.farama.org/content/gym_compatibility/
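To make the size of the change concrete: under the Gymnasium definition, `step` returns `(obs, reward, terminated, truncated, info)` instead of the older `(obs, reward, done, info)`, and `reset` returns `(obs, info)`. The following is a minimal sketch of converting an old-style step result, assuming time-limit endings are flagged with the `"TimeLimit.truncated"` key that the old `TimeLimit` wrapper writes into `info`. The function name `to_new_step_api` is hypothetical and is not part of Gym, Gymnasium, or this repository.

```python
from typing import Any, Dict, Tuple


def to_new_step_api(
    obs: Any, reward: float, done: bool, info: Dict[str, Any]
) -> Tuple[Any, float, bool, bool, Dict[str, Any]]:
    """Convert an old-style 4-tuple step result into the 5-tuple form.

    Hypothetical helper for illustration only.
    """
    # Assumption: the old TimeLimit wrapper marks time-limit endings in `info`.
    truncated = bool(info.get("TimeLimit.truncated", False))
    # The episode terminated "for real" only if it ended for another reason.
    terminated = bool(done) and not truncated
    return obs, reward, terminated, truncated, info


if __name__ == "__main__":
    # Episode cut off by the time limit: truncated, not terminated.
    print(to_new_step_api(0, 1.0, True, {"TimeLimit.truncated": True}))
    # Episode ended by the task itself: terminated, not truncated.
    print(to_new_step_api(0, 1.0, True, {}))
```

Keeping the two flags separate matters for value bootstrapping: a truncated episode has not reached a terminal state, so the learner may still want to bootstrap from the last observation.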

Migrating to TorchRL definition

TorchRL's environment definition, EnvBase, relies heavily on tensordict.

The advantages of their base class are:

tensordict makes it easier to work with batched data at the scale that Isaac Sim allows, and should be more efficient and flexible than a dict of torch.Tensor.
It allows complex datatypes (continuous/discrete, images/proprioception) for both observation and action spaces. This helps the definition generalize to a wider audience (such as multi-agent learning).

Related Issues: torchrl #883

Effect on the remaining framework

Since there are wrappers for other RL frameworks (RL-Games and RSL-RL), this probably won't cause any breaking changes on that side. We will just need to adapt them based on the chosen IsaacEnv definition.
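As a rough illustration of what "adapting" a wrapper could look like, here is a minimal sketch of an adapter that exposes a single `done` flag on top of an environment using the 5-tuple step convention. Everything in it is an assumption made for illustration: `SingleDoneAdapter`, `CountingEnv`, and the `"truncated"` info key are hypothetical names, and this is not the actual RL-Games or RSL-RL wrapper code.

```python
from typing import Any, Dict, Tuple


class CountingEnv:
    """Toy stand-in for an environment using the 5-tuple step convention."""

    def __init__(self, max_steps: int = 3) -> None:
        self.max_steps = max_steps
        self.t = 0

    def reset(self) -> Tuple[int, Dict[str, Any]]:
        self.t = 0
        return self.t, {}

    def step(self, action: Any) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self.t += 1
        # This toy task never terminates on its own; it only hits a time limit.
        truncated = self.t >= self.max_steps
        return self.t, 0.0, False, truncated, {}


class SingleDoneAdapter:
    """Hypothetical adapter for a framework that expects one `done` flag."""

    def __init__(self, env: Any) -> None:
        self.env = env

    def reset(self) -> Any:
        obs, _info = self.env.reset()
        return obs

    def step(self, action: Any) -> Tuple[Any, float, bool, Dict[str, Any]]:
        obs, reward, terminated, truncated, info = self.env.step(action)
        # Copy so the wrapped env's dict is not mutated, then keep the
        # truncation signal available under a (hypothetical) key.
        info = dict(info)
        info["truncated"] = truncated
        return obs, reward, terminated or truncated, info


if __name__ == "__main__":
    env = SingleDoneAdapter(CountingEnv(max_steps=2))
    env.reset()
    print(env.step(None))
    print(env.step(None))
```

The point of the sketch is only that the adaptation can stay local to the wrapper: the base environment keeps the richer return type, and each framework wrapper collapses it into whatever that framework expects.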

Checklist

[x] I have checked that there is no similar issue in the repo (required)

```python
from dataclasses import dataclass
from typing import Any, Optional, Protocol, SupportsFloat, runtime_checkable


@dataclass
class StepReturn:
    """Result of a single environment step."""

    observation: Any
    reward: SupportsFloat
    terminated: bool
    truncated: bool
    # info: Info | None


@dataclass
class ResetReturn:
    """Result of an environment reset."""

    observation: Any
    # info: Info | None


@runtime_checkable
class EnvCompatible(Protocol):
    """Structural type: any object with matching step/reset methods."""

    def step(self, action: Any) -> StepReturn:
        """Step."""

    def reset(self, seed: Optional[Any], options: Optional[Any]) -> ResetReturn:
        """Reset."""
```
