[Bug Report] Obs dim error when adding camera sensor

Describe the bug

When adding the camera sensor to obtain rgbd data, observation_manager can not parse the dimention well.

Steps to reproduce

define rgbd data

def camera_rgb(env: BaseEnv, asset_cfg: SceneEntityCfg = SceneEntityCfg("camera")) -> torch.Tensor:
    """The joint velocities of the asset."""
    # extract the used quantities (to enable type-hinting)
    asset = env.scene[asset_cfg.name]
    rgb = asset.data.output["rgb"]  # (num_env, 480, 640, 4), rgba, a means alpha透明度
    return rgb

def camera_depth(env: BaseEnv, asset_cfg: SceneEntityCfg = SceneEntityCfg("camera")) -> torch.Tensor:
    """The joint velocities of the asset."""
    # extract the used quantities (to enable type-hinting)
    asset = env.scene[asset_cfg.name]
    depth = asset.data.output["distance_to_image_plane"] # (num_env, 480, 640)
    return depth

define the rgb and depth observation

@configclass
class ObservationsCfg:
    """Observation specifications for the MDP."""

    @configclass
    class PolicyCfg(ObsGroup):
        """Observations for policy group."""

        # observation terms (order preserved)
        base_pos = ObsTerm(func=mdp.base_pos, params={"asset_cfg": SceneEntityCfg("robot")})
        base_orn  = ObsTerm(func=mdp.base_orn) # w, x, y, z
        rgb = ObsTerm(func=camera_rgb)
        depth = ObsTerm(func=camera_depth)

        def __post_init__(self) -> None:
            self.enable_corruption = False
            self.concatenate_terms = False

    # observation groups
    policy: PolicyCfg = PolicyCfg()

error

[INFO]: Time taken for simulation start : 1.775997 seconds
[INFO] Action Manager:  <ActionManager> contains 1 active terms.
+----------------------------------+
|  Active Action Terms (shape: 3)  |
+--------+------------+------------+
| Index  | Name       |  Dimension |
+--------+------------+------------+
|   0    | joint_pos  |          3 |
+--------+------------+------------+

2024-03-08 10:28:42 [31,090ms] [Error] [__main__] stack expects each tensor to be equal size, but got [1] at entry 0 and [3] at entry 2
2024-03-08 10:28:42 [31,091ms] [Error] [__main__] Traceback (most recent call last):
  File "/home/a/yizhenzhang/ov_orbit/source/standalone/embodied_ai/run_gen_data.py", line 189, in <module>
    main()
  File "/home/a/yizhenzhang/ov_orbit/source/standalone/embodied_ai/run_gen_data.py", line 56, in main
    env = BaseEnv(cfg=env_cfg)
  File "/home/a/yizhenzhang/ov_orbit/source/extensions/omni.isaac.orbit/omni/isaac/orbit/envs/base_env.py", line 131, in __init__
    self.load_managers()
  File "/home/a/yizhenzhang/ov_orbit/source/extensions/omni.isaac.orbit/omni/isaac/orbit/envs/base_env.py", line 204, in load_managers
    self.observation_manager = ObservationManager(self.cfg.observations, self)
  File "/home/a/yizhenzhang/ov_orbit/source/extensions/omni.isaac.orbit/omni/isaac/orbit/managers/observation_manager.py", line 48, in __init__
    self._group_obs_dim[group_name] = tuple(torch.sum(torch.stack(term_dims, dim=0), dim=0).tolist())
RuntimeError: stack expects each tensor to be equal size, but got [1] at entry 0 and [3] at entry 2

System Info

Describe the characteristic of your environment:

Commit: [93f3eff55341c2a07ca9e3f15717622d24d94584]
Isaac Sim Version: [2023.1.1]
OS: [Ubuntu 20.04]
GPU: [A6000]
CUDA: [11.7]
GPU Driver: [550.54.14]

Additional context

I think this bug is due to the observation_manager code does not consider this rgbd data type.

Checklist

[ x] I have checked that there is no similar issue in the repo (required)
[ x] I have checked that the issue is not in running Isaac Sim itself and is related to the repo

Acceptance Criteria

Add the criteria for which this task is considered done. If not known at issue creation time, you can add this once the issue is assigned.

[x ] Criteria 1
[ x] Criteria 2

Hey @Privilger this erorr appears as the manager is trying to stack all your observations in the PolicyCfg ObsGroup. you can avoid this error by creating an ObsGroup for every observation tensor that have the same size. From what I see you could possible have rgb and D data in ObsGroup and the other observation tensors in one ObsGroup each. For example,

@configclass
class ObservationsCfg:
    """Observation specifications for the MDP."""

    @configclass
    class RGBDCfg(ObsGroup):
        """Observations for policy group."""

        # observation terms (order preserved)
        rgb = ObsTerm(func=camera_rgb)
        depth = ObsTerm(func=camera_depth)

        def __post_init__(self) -> None:
            self.enable_corruption = False
            self.concatenate_terms = False

    @configclass
    class OrientationCfg(ObsGroup):
        """Observations for policy group."""

        # observation terms (order preserved)
        base_orn  = ObsTerm(func=mdp.base_orn) # w, x, y, z
        def __post_init__(self) -> None:
            self.enable_corruption = False
            self.concatenate_terms = False

    @configclass
    class PositionCfg(ObsGroup):
        """Observations for policy group."""

        # observation terms (order preserved)
        base_pos = ObsTerm(func=mdp.base_pos, params={"asset_cfg": SceneEntityCfg("robot")})
        def __post_init__(self) -> None:
            self.enable_corruption = False
            self.concatenate_terms = False                        

    # observation groups
    rgbd: RGBDCfg = RGBDCfg()
    position: PositionCfg = PositionCfg()
    orientation: OrientationCfg = OrientationCfg()

isaac-sim / IsaacLab