google-deepmind / mujoco

Multi-Joint dynamics with Contact. A general purpose physics simulator.
https://mujoco.org
Apache License 2.0
8.19k stars 818 forks source link

Best Way to Create Point Clouds in MuJoCo? #1863

Closed jonzamora closed 2 months ago

jonzamora commented 3 months ago

Hi,

I'm a Master's Student at USC, and i'm using MuJoCo for Simulating Robot Manipulation tasks.

I'm looking for help with the Best Way to Create Point Clouds in MuJoCo.

Currently, I understand how to extract rgb and depth using code like below:

with mujoco.Renderer(model) as renderer:
  # update renderer to render depth
  renderer.enable_depth_rendering()

  # reset the scene
  renderer.update_scene(data)

  # depth is a float array, in meters.
  depth = renderer.render()

  # Shift nearest values to the origin.
  depth -= depth.min()
  # Scale by 2 mean distances of near rays.
  depth /= 2*depth[depth <= 1].mean()
  # Scale to [0, 255]
  pixels = 255*np.clip(depth, 0, 1)

  media.show_image(pixels.astype(np.uint8))

I'm mostly wondering what is the most "free" way to create Point Clouds in MuJoCo (e.g. with built-in sensors), or if we need to use an external package like Open3D to construct Point Clouds from rgbd?

For reference, I came across this discussion https://github.com/google-deepmind/mujoco/discussions/688, which seems relevant to this question.

kevinzakka commented 2 months ago

Hi @jonzamora, while I think one shouldn't reinvent the wheel, creating a function that takes in an RGB-D pair and outputs a point cloud without any external dependencies should be pretty straightforward. That being said, Open3D is probably faster, well tested and has some additional useful features (e.g outlier removal, tsdf-fusion, etc.) so might be worth using as well.

kevinzakka commented 2 months ago

Made a tutorial script for you! We only use open3d for visualization, everything else is mujoco / python only.

https://github.com/user-attachments/assets/db5ccee3-fddb-4dcf-8ddc-ab00bd3e2541

Click for code ```python import mujoco import mujoco.viewer from mujoco.renderer import Renderer import numpy as np import open3d as o3d XML = """ """ if __name__ == "__main__": model = mujoco.MjModel.from_xml_string(XML) data = mujoco.MjData(model) mujoco.mj_forward(model, data) # Define camera parameters and init renderer. height = 480 width = 640 fps = 30 camera_id = model.cam("camera").id renderer = Renderer(model, height=height, width=width) # Intrinsic matrix. fov = model.cam_fovy[camera_id] theta = np.deg2rad(fov) fx = width / 2 / np.tan(theta / 2) fy = height / 2 / np.tan(theta / 2) cx = (width - 1) / 2.0 cy = (height - 1) / 2.0 intr = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1]]) # Extrinsic matrix. cam_pos = data.cam_xpos[camera_id] cam_rot = data.cam_xmat[camera_id].reshape(3, 3) extr = np.eye(4) extr[:3, :3] = cam_rot.T extr[:3, 3] = cam_pos def render_rgbd( renderer: mujoco.Renderer, camera_id: int, ) -> tuple[np.ndarray, np.ndarray]: renderer.update_scene(data, camera=camera_id) renderer.enable_depth_rendering() depth = renderer.render() renderer.disable_depth_rendering() rgb = renderer.render() return rgb, depth def rgbd_to_pointcloud( rgb: np.ndarray, depth: np.ndarray, intr: np.ndarray, extr: np.ndarray, depth_trunc: float = 20.0, ): cc, rr = np.meshgrid(np.arange(width), np.arange(height), sparse=True) valid = (depth > 0) & (depth < depth_trunc) z = np.where(valid, depth, np.nan) x = np.where(valid, z * (cc - intr[0, 2]) / intr[0, 0], 0) y = np.where(valid, z * (rr - intr[1, 2]) / intr[1, 1], 0) xyz = np.vstack([e.flatten() for e in [x, y, z]]).T color = rgb.transpose([2, 0, 1]).reshape((3, -1)).T / 255.0 mask = np.isnan(xyz[:, 2]) xyz = xyz[~mask] color = color[~mask] xyz_h = np.hstack([xyz, np.ones((xyz.shape[0], 1))]) xyz_t = (extr @ xyz_h.T).T xyzrgb = np.hstack([xyz_t[:, :3], color]) return xyzrgb # Simulate for 10 seconds and capture RGB-D images at fps Hz. xyzrgbs: list[np.ndarray] = [] mujoco.mj_resetData(model, data) while data.time < 10.0: mujoco.mj_step(model, data) if len(xyzrgbs) < data.time * fps: rgb, depth = render_rgbd(renderer, camera_id) xyzrgb = rgbd_to_pointcloud(rgb, depth, intr, extr) xyzrgbs.append(xyzrgb) # Visualize in open3d. vis = o3d.visualization.Visualizer() vis.create_window() pcd = o3d.geometry.PointCloud() pcd.points = o3d.utility.Vector3dVector(xyzrgbs[0][:, :3]) pcd.colors = o3d.utility.Vector3dVector(xyzrgbs[0][:, 3:]) vis.add_geometry(pcd) frame = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.6) vis.add_geometry(frame) counter: int = 1 def update_pc(vis): global counter if counter < len(xyzrgbs) - 1: pcd.points = o3d.utility.Vector3dVector(xyzrgbs[counter][:, :3]) pcd.colors = o3d.utility.Vector3dVector(xyzrgbs[counter][:, 3:]) vis.update_geometry(pcd) counter += 1 vis.register_animation_callback(update_pc) vis.run() vis.destroy_window() renderer.close() ```
jonzamora commented 2 months ago

This is awesome, thanks for your help Kevin! @kevinzakka 🔥🙏