isaac-sim / IsaacLab

Unified framework for robot learning built on NVIDIA Isaac Sim
https://isaac-sim.github.io/IsaacLab
Other
1.97k stars 787 forks source link

[Bug Report] Multiple Tiled Cameras Override Each Other #1070

Open ori-gadot opened 3 days ago

ori-gadot commented 3 days ago

Bug Description

When using two tiled cameras in the scene, the last camera overrides all observations, resulting in both cameras capturing the same image.

Steps to reproduce

  1. Run the following script, which initializes the CartpoleRGBCameraEnvCfg environment with two environments and two tiled cameras.
from omni.isaac.lab.app import AppLauncher

app_launcher = AppLauncher(headless=False, enable_cameras=True)
simulation_app = app_launcher.app

"""Rest everything follows."""

import torch

from omni.isaac.lab.envs import ManagerBasedRLEnv
from omni.isaac.lab.sensors.camera.utils import save_images_to_file

from omni.isaac.lab_tasks.manager_based.classic.cartpole import CartpoleRGBCameraEnvCfg

def main():
    env_cfg = CartpoleRGBCameraEnvCfg()
    env_cfg.scene.num_envs = 2
    env = ManagerBasedRLEnv(cfg=env_cfg)

    while simulation_app.is_running():
        with torch.inference_mode():
            actions = torch.rand_like(env.action_manager.action)
            obs, _, _, _, _ = env.step(actions)
            save_images_to_file(obs["policy"]["image1"], "/tmp/image1.png")
            save_images_to_file(obs["policy"]["image2"], "/tmp/image2.png")
    env.close()

if __name__ == "__main__":
    main()
    simulation_app.close()
  1. Modify the CartpoleRGBCameraSceneCfg class in the following file: source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/classic/cartpole/cartpole_camera_env_cfg.py, adding two tiled cameras as shown below:
@configclass
class CartpoleRGBCameraSceneCfg(CartpoleSceneCfg):
    tiled_camera1: TiledCameraCfg = TiledCameraCfg(
        prim_path="{ENV_REGEX_NS}/Camera1",
        offset=TiledCameraCfg.OffsetCfg(pos=(-7.0, 0.0, 3.0), rot=(0.9945, 0.0, 0.1045, 0.0), convention="world"),
        data_types=["rgb"],
        spawn=sim_utils.PinholeCameraCfg(
            focal_length=24.0, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 20.0)
        ),
        width=1280,
        height=720,
    )

    tiled_camera2: TiledCameraCfg = TiledCameraCfg(
        prim_path="{ENV_REGEX_NS}/Camera2",
        offset=TiledCameraCfg.OffsetCfg(pos=(-7.0, 0.0, 3.0), convention="world"),
        data_types=["rgb"],
        spawn=sim_utils.PinholeCameraCfg(
            focal_length=24.0, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 20.0)
        ),
        width=1280,
        height=720,
    )
  1. Modify RGBObservationsCfg to include observations for both cameras:
@configclass
class RGBObservationsCfg:
    """Observation specifications for the MDP."""

    @configclass
    class RGBCameraPolicyCfg(ObsGroup):
        """Observations for policy group."""

        image1 = ObsTerm(func=grab_images, params={"sensor_cfg": SceneEntityCfg("tiled_camera1"), "data_type": "rgb", "normalize": True})
        image2 = ObsTerm(func=grab_images, params={"sensor_cfg": SceneEntityCfg("tiled_camera2"), "data_type": "rgb", "normalize": True})

        def __post_init__(self) -> None:
            self.enable_corruption = False
            self.concatenate_terms = False

    policy: ObsGroup = RGBCameraPolicyCfg()
  1. Modify the scene lights in source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/classic/cartpole/cartpole_env_cfg.py for clearer image results:
@configclass
class CartpoleSceneCfg(InteractiveSceneCfg):
    """Configuration for a cart-pole scene."""

    # ground plane
    ground = AssetBaseCfg(
        prim_path="/World/ground",
        spawn=sim_utils.GroundPlaneCfg(size=(100.0, 100.0), color=(0.5, 0.5, 0.5)),
    )

    # cartpole
    robot: ArticulationCfg = CARTPOLE_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")

    # lights
    dome_light = AssetBaseCfg(
        prim_path="/World/DomeLight",
        spawn=sim_utils.DomeLightCfg(color=(0.9, 0.9, 0.9), intensity=10.0),
    )
    distant_light = AssetBaseCfg(
        prim_path="/World/DistantLight",
        spawn=sim_utils.DistantLightCfg(color=(0.9, 0.9, 0.9), intensity=1.0),
        init_state=AssetBaseCfg.InitialStateCfg(rot=(0.738, 0.477, 0.477, 0.0)),
    )

    sphere_light = AssetBaseCfg(
        prim_path="{ENV_REGEX_NS}/SphereLight",
        spawn=sim_utils.SphereLightCfg(
            color=(0.1, 0.1, 0.1), enable_color_temperature=True, color_temperature=5500, intensity=100, radius=2.0, exposure=10.0,
        ),
        init_state=AssetBaseCfg.InitialStateCfg(pos=(0, 0, 5)),
    )

Additional context

Two identical tiled cameras with slightly different orientations (one capturing more of the ground, the other more of the sky) were used. Observation data was saved as images for inspection.

Running the following scenarios, only the first one reproduces the bug:

Note: Although it may be unrelated, there is a noticeable decline in image quality in the first scenario compared to the other two.

garylvov commented 3 days ago

I can confirm that this issue already happens by the time that the image is read by grab_images in the observation group, but only for Tiled Cameras (standard USD cameras look like they work).

I've combined your repro steps in a single standalone file for easier troubleshooting here: https://gist.github.com/garylvov/d9cd169883838bd5b43d2f7b6d75c52c image

garylvov commented 3 days ago

@Dhoeller19 @Mayankm96 I believe that this may be on Isaac sim side, or maybe the buffers in tiled_camera.py aren't created/used correctly

WangYCheng23 commented 2 days ago

same issue.

kellyguo11 commented 2 days ago

Thanks for the report, I don't believe we can currently have multiple tiled cameras at different resolutions as the resolution settings are global. We'll look into adding support for this.

glvov-bdai commented 2 days ago

@kellyguo11 I believe this issue still happens even if the multiple tiled cameras are the same resolution

kellyguo11 commented 1 day ago

Yes, you are right. It seems currently we can only have one tiled camera. The fix is in the works to support multiple tiled cameras, including at different resolutions. Hopefully it'll be made available in the next release.

elle-miller commented 1 day ago

@kellyguo11 Just to add to the discussion, as described in https://github.com/isaac-sim/IsaacLab/issues/992 I can successfully render 2x tiled cameras and get different images (same resolution), but there is an issue with correctly setting both poses.

glvov-bdai commented 1 day ago

@elle-miller Has an interesting observation @kellyguo11 with #992 .

EDIT: I believe that #1070 is still an issue as #992 looks like it produces different images, but I believe that they are the same image just with different noise levels.

I can confirm that when I set the camera views with set_world_poses_from_view, I am able to get different images from different tiled cameras in the same environment.

Using set_world_poses_from_view : https://gist.github.com/glvov-bdai/767ad9d76ea2a10b625fe7cbfbc21c20

image

kensukenk commented 6 hours ago

Hi, I have a similar issue.

When I have a Franka environment with a front-facing and wrist camera, I can get the Tiled Images to render properly. However, the moment I have two environments, the TiledCamera on the wrist has an incorrect orientation:

test_wrist0 Above is what my wrist camera is supposed to look like. It tracks the arm normally.

test_wrist3 This is what the wrist camera looks like if I have multiple environments. However, my front camera renders properly (i.e., it doesn't seem like the wrist camera overwrites the front camera. I will sat that the front camera suffers from #1031

test_front Normal front camera view

test_front2 Front Camera view if I have parallelized environments. Here, the block appear distorted.

glvov-bdai commented 5 hours ago

@kensukenk Can you please be a little bit more explicit about what the problem is? Is it the camera not being in the desired orientation when using Tiled Cameras but it is in the correct orientation with USD Cameras?

If you are using set_world_poses for your camera (or anything related to pose), I'd double check that the pose convention is consistent ("opengl" or "ros" or "world) for all cameras. From what I was able to piece together from your photos, I'd guess that the usd camera is in OpenGL format, whetheras the Tiled camera may be in ROS format, but I'm not sure.

from omni.isaac.lab.sensors.camera.camera.py

    Since different fields use different conventions for camera orientations, the method allows users to
    set the camera poses in the specified convention. Possible conventions are:

    - :obj:`"opengl"` - forward axis: -Z - up axis +Y - Offset is applied in the OpenGL (Usd.Camera) convention
    - :obj:`"ros"`    - forward axis: +Z - up axis -Y - Offset is applied in the ROS convention
    - :obj:`"world"`  - forward axis: +X - up axis +Z - Offset is applied in the World Frame convention

    See :meth:`omni.isaac.lab.sensors.camera.utils.convert_orientation_convention` for more details
    on the conventions.

Can you please share code to replicate your results? I'm surprised that the multiple cameras don't override each other and would be curious to investigate this further.

kensukenk commented 5 hours ago

Sure, my code is a modification of the Franka Cube Lift env. My issue is that when I use only one environment, my TiledCamera renders correctly. However, when I use two or more environments in parallel (and change nothing else), the TiledCamera on the franka wrist flips orientation.

I added cameras to the lift_env_cfg.py file as

'class ObjectTableSceneCfg(InteractiveSceneCfg): """Configuration for the lift scene with a robot and a object. This is the abstract base implementation, the exact scene is defined in the derived classes which need to set the target object, robot and end-effector frames """

# robots: will be populated by agent env cfg
robot: ArticulationCfg = MISSING
# end-effector sensor: will be populated by agent env cfg
ee_frame: FrameTransformerCfg = MISSING
# target object: will be populated by agent env cfg
object: RigidObjectCfg = MISSING
#camera: TiledCameraCfg = MISSING
wrist_cam: CameraCfg | TiledCameraCfg = MISSING
front_cam: CameraCfg | TiledCameraCfg = MISSING'

I also set self.rerender_on_reset=True, in case that matters.

and ` @configclass class FrankaCubeLiftEnvCfg(LiftEnvCfg): def __post_init__(self):

post init of parent

    super().__post_init__()

    # Set Franka as robot
    self.scene.robot = FRANKA_PANDA_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")

    # Set actions for the specific robot type (franka)
    self.actions.body_joint_pos = mdp.JointPositionActionCfg(
        asset_name="robot", joint_names=["panda_joint.*"], scale=0.5, use_default_offset=True
    )
    self.actions.finger_joint_pos = mdp.BinaryJointPositionActionCfg(
        asset_name="robot",
        joint_names=["panda_finger.*"],
        open_command_expr={"panda_finger_.*": 0.04},
        close_command_expr={"panda_finger_.*": 0.0},
    )
    # Set the body name for the end effector
    #self.commands.object_pose.body_name = "panda_hand"

    '''# Set Cube as object
    self.scene.object = RigidObjectCfg(
        prim_path="{ENV_REGEX_NS}/Object",
        init_state=RigidObjectCfg.InitialStateCfg(pos=[0.5, 0, 0.055], rot=[1, 0, 0, 0]),
        spawn=UsdFileCfg(
            usd_path=f"{ISAAC_NUCLEUS_DIR}/Props/Blocks/DexCube/dex_cube_instanceable.usd",
            scale=(0.8, 0.8, 4.0),
            rigid_props=RigidBodyPropertiesCfg(
                solver_position_iteration_count=16,
                solver_velocity_iteration_count=1,
                max_angular_velocity=1000.0,
                max_linear_velocity=1000.0,
                max_depenetration_velocity=5.0,
                disable_gravity=False,
            ),
        ),
    )'''
    self.scene.left =  RigidObjectCfg(
        prim_path="{ENV_REGEX_NS}/left",
        spawn=sim_utils.CuboidCfg(
            size=[0.05, 0.05, 0.2],
            rigid_props=sim_utils.RigidBodyPropertiesCfg(),
            mass_props=sim_utils.MassPropertiesCfg(mass=1.0),
            collision_props=sim_utils.CollisionPropertiesCfg(),
            visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(1.0, 0.0, 0.0), metallic=0.2),
        ),
        init_state=RigidObjectCfg.InitialStateCfg(pos=(0.6, 0.07, 0.18), rot=(1.0, 0.0, 0.0, 0.0)),
    )
    self.scene.object =  RigidObjectCfg(
        prim_path="{ENV_REGEX_NS}/Object",
        spawn=sim_utils.CuboidCfg(
            size=[0.05, 0.05, 0.2],
            rigid_props=sim_utils.RigidBodyPropertiesCfg(),
            mass_props=sim_utils.MassPropertiesCfg(mass=1.0),
            collision_props=sim_utils.CollisionPropertiesCfg(),
            visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(0.0, 1.0, 0.0), metallic=0.2),
        ),
        init_state=RigidObjectCfg.InitialStateCfg(pos=[0.6, 0, 0.18], rot=[1.0, 0.0, 0.0, 0.0]),
    )
    self.scene.right =  RigidObjectCfg(
        prim_path="{ENV_REGEX_NS}/right",
        spawn=sim_utils.CuboidCfg(
            size=[0.05, 0.05, 0.2],
            rigid_props=sim_utils.RigidBodyPropertiesCfg(),
            mass_props=sim_utils.MassPropertiesCfg(mass=1.0),
            collision_props=sim_utils.CollisionPropertiesCfg(),
            visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(1.0, 0.0, 0.0), metallic=0.2),
        ),
        init_state=RigidObjectCfg.InitialStateCfg(pos=(0.6, -0.07, 0.18), rot=(1.0, 0.0, 0.0, 0.0)),
    )

    self.scene.base =  RigidObjectCfg(
        prim_path="{ENV_REGEX_NS}/base",
        spawn=sim_utils.CuboidCfg(
            size=[0.7, 0.7, 0.08],
            rigid_props=sim_utils.RigidBodyPropertiesCfg(),
            mass_props=sim_utils.MassPropertiesCfg(mass=10.0),
            collision_props=sim_utils.CollisionPropertiesCfg(),
            visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(0.0, 0.0, 1.0), metallic=0.2),
        ),
        init_state=RigidObjectCfg.InitialStateCfg(pos=(0.5, -0.00, 0.05), rot=(1.0, 0.0, 0.0, 0.0)),
    )
    self.scene.wall1 =  RigidObjectCfg(
        prim_path="{ENV_REGEX_NS}/wall1",
        spawn=sim_utils.CuboidCfg(
            size=[2.5, 0.1, 3.],
            rigid_props=sim_utils.RigidBodyPropertiesCfg(),
            mass_props=sim_utils.MassPropertiesCfg(mass=10.0),
            collision_props=sim_utils.CollisionPropertiesCfg(),
            visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(1.0, 1.0, 1.0), metallic=0.2),
        ),
        init_state=RigidObjectCfg.InitialStateCfg(pos=(0., 1.50, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
    )
    self.scene.wall2 =  RigidObjectCfg(
        prim_path="{ENV_REGEX_NS}/wall2",
        spawn=sim_utils.CuboidCfg(
            size=[2.5, 0.3, 3.],
            rigid_props=sim_utils.RigidBodyPropertiesCfg(),
            mass_props=sim_utils.MassPropertiesCfg(mass=10.0),
            collision_props=sim_utils.CollisionPropertiesCfg(),
            visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(1.0, 1.0, 1.0), metallic=0.2),
        ),
        init_state=RigidObjectCfg.InitialStateCfg(pos=(0., -1.50, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
    )
    self.scene.wall3 =  RigidObjectCfg(
        prim_path="{ENV_REGEX_NS}/wall3",
        spawn=sim_utils.CuboidCfg(
            size=[0.3, 3, 3.],
            rigid_props=sim_utils.RigidBodyPropertiesCfg(),
            mass_props=sim_utils.MassPropertiesCfg(mass=10.0),
            collision_props=sim_utils.CollisionPropertiesCfg(),
            visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(1.0, 1.0, 1.0), metallic=0.2),
        ),
        init_state=RigidObjectCfg.InitialStateCfg(pos=(1.4, 0.00, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
    )
    self.scene.wall4 =  RigidObjectCfg(
        prim_path="{ENV_REGEX_NS}/wall4",
        spawn=sim_utils.CuboidCfg(
            size=[0.3, 3, 3.],
            rigid_props=sim_utils.RigidBodyPropertiesCfg(),
            mass_props=sim_utils.MassPropertiesCfg(mass=10.0),
            collision_props=sim_utils.CollisionPropertiesCfg(),
            visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(1.0, 1.0, 1.0), metallic=0.2),
        ),
        init_state=RigidObjectCfg.InitialStateCfg(pos=(-1.4, 0.00, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
    )

    # Listens to the required transforms
    marker_cfg = FRAME_MARKER_CFG.copy()
    marker_cfg.markers["frame"].scale = (0.1, 0.1, 0.1)
    marker_cfg.prim_path = "/Visuals/FrameTransformer"
    self.scene.ee_frame = FrameTransformerCfg(
        prim_path="{ENV_REGEX_NS}/Robot/panda_link0",
        debug_vis=False,
        visualizer_cfg=marker_cfg,
        target_frames=[
            FrameTransformerCfg.FrameCfg(
                prim_path="{ENV_REGEX_NS}/Robot/panda_hand",
                name="end_effector",
                offset=OffsetCfg(
                    pos=[0.0, 0.0, 0.1034],
                ),
            ),
        ],
    )

    self.scene.wrist_cam = TiledCameraCfg(
        prim_path="{ENV_REGEX_NS}/Robot/panda_hand/wrist_cam",
        update_period=0.01,
        height=128,
        width=128,
        data_types=["rgb"],
        spawn=PinholeCameraCfg(
            focal_length=24.0, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 20)
        ),
        offset=TiledCameraCfg.OffsetCfg(pos=(0.1, 0.0, 0.0), rot=(0.6848517, -0.1750107, -0.1748713, 0.6853973), convention="ros"),
    )

    self.scene.front_cam = TiledCameraCfg(
            prim_path="{ENV_REGEX_NS}/front_cam",
        update_period=0.01,
        height=128,
        width=128,
        data_types=["rgb"],
        spawn=PinholeCameraCfg(
            focal_length=24.0, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 20)
        ),
        offset=TiledCameraCfg.OffsetCfg(pos=(0.75, 0.0, 0.75), rot=(-0.1259888, 0.6960601, 0.695506, -0.1260892 ), convention="ros"),
    )

`

In particular, I defined the TiledCameras in the last text chunks directly above (self.scene.front_cam, self.scene.wrist_cam)

glvov-bdai commented 3 hours ago

@kensukenk Thanks for the additional information. This looks like a separate issue to me to do with how poses are being set, do you mind opening a new issue with this information so we can keep this issue more focused to cameras overriding each other?

Also if you're able to provide the steps to reproduce as a standalone file in the style of this gist: https://gist.github.com/garylvov/d9cd169883838bd5b43d2f7b6d75c52c

that would help the dev team trouble shoot this issue faster. Otherwise, we have to recreate/keep track of your modifications, which can slow things down.

Thanks!