NVlabs / curobo

CUDA Accelerated Robot Library
https://curobo.org

How to use RGB-D depth map with cuRobo? #92

Closed pmj110119 closed 8 months ago

pmj110119 commented 8 months ago

If it’s not a bug, please use discussions: https://github.com/NVlabs/curobo/discussions

Please provide the below information in addition to your issue:

  1. cuRobo installation mode (choose from [python, isaac sim, docker python, docker isaac sim]): docker isaac sim
  2. python version: /
  3. Isaac Sim version (if using): 2022 version

Hi guys, thanks for your great work!!!

I want to use cuRobo with an RGB-D camera, using the API from the examples:

```python
data_camera = CameraObservation(
    # rgb_image=data["rgba_nvblox"],
    depth_image=data["depth"], intrinsics=data["intrinsics"], pose=camera_pose
)
data_camera = data_camera.to(device=tensor_args.device)
world_model.add_camera_frame(data_camera, "world")
world_model.process_camera_frames("world", False)
torch.cuda.synchronize()
world_model.update_blox_hashes()
```

But even when the RGB-D camera is pointed at the tabletop, where the point cloud should be a thin plane, the generated occupancy is always bloated: a thick occupancy approximating a cube is generated instead of a plane.

[screenshot]

[screenshot]

I'm confused about this, and I haven't found any other API examples or documentation. Is there a more accurate way to model point clouds?

balakumar-s commented 8 months ago
  1. We use nvblox to build a Euclidean Signed Distance Field (ESDF) from depth images. nvblox stores SDF values in voxels at a given resolution, which is 2 cm in the example (https://github.com/NVlabs/curobo/blob/c09d94908d8ad52f557ed59195c1ceb2d0434d65/examples/isaac_sim/realsense_reacher.py#L192).

  2. To render the SDF in Isaac Sim, we generate a grid of spheres (radius==voxel_size) and then query the voxelized ESDF representation for which spheres are in collision (either at the surface of an obstacle or inside an obstacle) here: https://github.com/NVlabs/curobo/blob/c09d94908d8ad52f557ed59195c1ceb2d0434d65/examples/isaac_sim/realsense_reacher.py#L303

  3. Then given this list of colliding spheres, we render them as cuboids in Isaac Sim using a voxel manager: https://github.com/NVlabs/curobo/blob/c09d94908d8ad52f557ed59195c1ceb2d0434d65/examples/isaac_sim/realsense_reacher.py#L259

Note that our query spheres in (2) are twice the voxel_size; this could be reduced to exactly match the size of the voxels.

Since we voxelize the world and store it as an SDF, we can only render down to the voxel_size.

You should still only see a plane of cubes if the camera is pointing to an empty table. Did you change the camera pose in simulation?
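The sphere query in (2)-(3) can be sketched with a toy, self-contained example (plain Python, not the cuRobo API; the thin-plane SDF here is a hypothetical stand-in for nvblox's ESDF):

```python
# Toy sketch of the sphere query in steps (2)-(3), NOT the cuRobo API:
# a voxelized SDF of a thin tabletop surface at z = 0, queried by a grid
# of spheres with radius == voxel_size.
voxel_size = 0.02          # 2 cm, as in the example config
radius = voxel_size

def plane_sdf(z):
    # Distance to a thin plane at z = 0 (hypothetical stand-in for the ESDF).
    return abs(z)

# A sphere "collides" if its center is within one radius of the surface.
colliding_layers = [k for k in range(-5, 6)
                    if plane_sdf(k * voxel_size) <= radius]
print(colliding_layers)    # -> [-1, 0, 1]: only a thin slab of voxel layers
```

With a correct ESDF of an empty table, only the layers touching the surface should collide, which is why the rendered cubes should form a thin plane rather than a thick block.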

pmj110119 commented 8 months ago
> Did you change the camera pose in simulation?

Thanks for the reply. Yes, I changed the camera pose in the simulator so that the back-projected point cloud aligns with the tabletop.

I understand that voxels are used for rendering; my problem is that the ESDF itself, constructed from the depth map (which should represent a thin plane), looks wrong. It seems that when the ESDF is built, the large region behind the thin plane is also included in some form, causing voxels to appear in those places.

I found that if the point cloud is drawn directly using VoxelManager, instead of using the voxels obtained from get_voxels_in_bounding_box, it appears as a thin plane.

But if I visualize the latter, I see a large, thick object composed of tiny cubes instead of a plane.
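The direct path can be sketched as follows (a toy illustration, not cuRobo's VoxelManager): quantizing points sampled from a flat surface puts them all into a single voxel layer, which is why the raw point cloud renders as a thin plane.

```python
# Toy sketch of direct point-cloud voxelization (NOT cuRobo's VoxelManager).
import math

voxel_size = 0.01
points = [(x * 0.05, y * 0.05, 0.0)          # flat tabletop at z = 0
          for x in range(10) for y in range(10)]

# Quantize each point to integer voxel indices.
occupied = {(math.floor(px / voxel_size),
             math.floor(py / voxel_size),
             math.floor(pz / voxel_size)) for (px, py, pz) in points}

z_layers = {k for (_, _, k) in occupied}
print(z_layers)    # -> {0}: one thin layer, no artificial thickness
```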

balakumar-s commented 8 months ago

Can you share an image of the VoxelManager rendering of the point cloud?

pmj110119 commented 8 months ago

Rendered voxels from get_voxels_in_bounding_box (voxel_size=0.01): the thin plane is expanded very thickly, so the entire surface sits much higher than the tabletop. [screenshot]

Rendered raw point cloud (voxel_size=0.001):

[screenshot]

[screenshot]

Both methods use the same data: camera_pose, depth, and camera intrinsic matrix.
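Since both methods share the same inputs, a pinhole back-projection sanity check can rule out an intrinsics mix-up (the values and helper below are illustrative, not cuRobo internals):

```python
# Minimal pinhole back-projection check (hypothetical intrinsics, NOT
# cuRobo internals): a depth pixel (u, v, d) maps to a camera-frame point.
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0

def backproject(u, v, depth):
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# The principal point maps straight down the optical axis:
print(backproject(320, 240, 1.0))   # -> (0.0, 0.0, 1.0)
```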

balakumar-s commented 8 months ago
  1. Is the camera static? If it is static but you are moving the camera cube in sim, this artifact can happen. Can you call world_model.clear_blox_layer('world') before every depth image addition? This will clear the previous ESDF data.
  2. Can you try to query voxels with a smaller size? Let's try voxel_size=0.002 in this line https://github.com/NVlabs/curobo/blob/c09d94908d8ad52f557ed59195c1ceb2d0434d65/examples/isaac_sim/realsense_reacher.py#L303
  3. Also change the render_voxel_size=0.002 so that the voxel size used for rendering cuboids is also small.

Note that get_voxels_in_bounding_box queries across all collision representations, so you will also see voxels for other obstacles, like the ground and pillar in your image.

pmj110119 commented 8 months ago
  1. Yes, the camera is static; it does not move in sim.

2&3. I added self.world_model.clear_blox_layer('world') before each depth image addition. I also set voxel_size=0.005 and downsampled the voxels from get_voxels_in_bounding_box to 1/20 for visualization to avoid running out of memory.

[screenshot]

To focus on visualizing the depth map, I did not add the ground and pillar to cuRobo's WorldConfig; they are only added in Isaac Sim.

pmj110119 commented 8 months ago

I have dumped the data; you can load it and have a try if you are willing:

```python
import pickle

with open('data_info.pkl', 'rb') as f:
    data = pickle.load(f)

depth_image = data['depth_image']
intrinsics = data['intrinsics']
pose = data['pose']

data_camera = CameraObservation(
    depth_image=depth_image, intrinsics=intrinsics, pose=pose
)
```

data_info.zip

balakumar-s commented 8 months ago

I can get a table + cube as seen in this mesh:

[screenshot]

Can you run the below script and confirm that you are also getting a similar mesh as me?

```python
import pickle

from curobo.types.camera import CameraObservation
from curobo.geom.sdf.world import CollisionCheckerType, WorldConfig
from curobo.wrap.reacher.motion_gen import MotionGen, MotionGenConfig
from curobo.geom.types import Cuboid


def save_nvblox_mesh():
    # create instance of motion_gen:
    world_cfg = WorldConfig.from_dict(
        {
            "blox": {
                "world": {
                    "pose": [0, 0, 0, 1, 0, 0, 0],
                    "integrator_type": "tsdf",
                    "voxel_size": 0.005,
                }
            }
        }
    )
    motion_gen_config = MotionGenConfig.load_from_robot_config(
        "franka.yml",
        world_cfg,
        collision_checker_type=CollisionCheckerType.BLOX,
    )
    mg = MotionGen(motion_gen_config)

    with open('data_info.pkl', 'rb') as f:
        data = pickle.load(f)

    depth_image = mg.tensor_args.to_device(data['depth_image'])
    intrinsics = data['intrinsics']
    pose = data['pose']
    data_camera = CameraObservation(
        depth_image=depth_image, intrinsics=intrinsics, pose=pose
    )
    mg.add_camera_frame(data_camera, "world")
    mg.process_camera_frames("world", False)
    mg.world_coll_checker.update_blox_hashes()

    coll_mesh = mg.world_coll_checker.get_mesh_in_bounding_box(
        Cuboid(name="test", pose=[0, 0, 0, 1, 0, 0, 0], dims=[1, 1, 1]),
        voxel_size=0.005,
    )
    coll_mesh.save_as_mesh("world_objects.stl")


if __name__ == '__main__':
    save_nvblox_mesh()
```
balakumar-s commented 8 months ago

Looking at your voxel visualization, this can happen if your depth is very noisy. Can I get a stream of images in a pickle file? That will help me debug what's going on.

balakumar-s commented 8 months ago

One other thing I observed is that your cube is very low compared to the table, so reducing the voxel size in nvblox can help. Try 0.005 for the voxel size here: https://github.com/NVlabs/curobo/blob/c09d94908d8ad52f557ed59195c1ceb2d0434d65/examples/isaac_sim/realsense_reacher.py#L192

pmj110119 commented 8 months ago
> Can I get a stream of images in a pickle file? That will help me debug what's going on.

Thanks for the reply! In fact, I am currently just replaying a single frame of data (the pkl I provided above), so there should be no depth noise over time.

pmj110119 commented 8 months ago

> Can you run the below script and confirm that you are also getting a similar mesh as me?

Thank you very much for the visualization code.

I tried your world_cfg, but it crashes with a segmentation fault.

So I exported the STL file using the code you provided with my previous configuration, and there was indeed some problem: [screenshot]

My world_cfg is:

```python
world_cfg = WorldConfig.from_dict(
    {
        "blox": {
            "world": {
                "pose": [0, 0, 0, 1, 0, 0, 0],
                "integrator_type": "occupancy",
                "voxel_size": 0.01,
            }
        }
    }
)
```

I changed integrator_type to 'tsdf', and the mesh changed: [screenshot]

It looks right now. Thanks again!!!

So I need to set integrator_type to 'tsdf' rather than 'occupancy'. What are the consequences of using 'tsdf'? (I ask because all the realsense examples use 'occupancy'.)

balakumar-s commented 8 months ago

Previously, the tsdf integrator did not support dynamic scenes. We recently upgraded cuRobo to use version 0.0.5 of nvblox, which adds dynamic scene support for tsdf. We haven't updated the examples yet and will do so in the next release.

Generally we have found tsdf to work more reliably than occupancy with single view images.
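A rough intuition for the difference, as a single-ray toy (plain Python; the truncation band and helper functions are hypothetical, not nvblox code): a TSDF stores a signed distance to the observed surface, truncated to a narrow band, giving a smooth zero crossing that meshing can localize precisely, while an occupancy integrator only flips cells along the ray between free and occupied.

```python
voxel_size = 0.005
trunc = 4 * voxel_size      # hypothetical truncation band
surface = 0.50              # depth reading along this ray, in meters

def tsdf(d):
    # Truncated signed distance for a voxel at distance d along the ray.
    return max(-trunc, min(trunc, surface - d))

def occupancy(d):
    # Naive occupancy: free before the hit, occupied at/behind it.
    return 1 if d >= surface else 0

# The TSDF crosses zero exactly at the surface; occupancy jumps in one voxel.
print(tsdf(0.50), occupancy(0.49), occupancy(0.51))   # -> 0.0 0 1
```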

This issue captures some good information on how to debug nvblox + cuRobo. If your issue is resolved, close it and we will convert it to a discussion so others might find it useful.

pmj110119 commented 8 months ago

I understand. Looking forward to future updates on your great work!