IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

D455 depth: 3D coordinates (x, y, z) differ between Python and RealSense Viewer #13492

Open · Prefiro-Prathik opened this issue 2 weeks ago

Prefiro-Prathik commented 2 weeks ago

Greetings!

Required Info

| Item | Value |
| --- | --- |
| Camera Model | 2 × D455 (multi-camera) |
| Firmware Version | latest |
| Operating System & Version | Windows 11 |
| Platform | PC |
| SDK Version | latest |
| Language | Python |

Issue Description

I'm using two D455 cameras that are 120 degrees apart but tilted to focus on the same region. I need the 3D points from each camera, which I then transform into a central coordinate system using a transformation matrix (the transformation is irrelevant to the problem).

When I use the depth 3D coordinates from the Intel RealSense Viewer, the 3D points I get from each camera are completely different from the 3D points I get from Python when mouse-clicking on the same object while streaming. Below is the Python code:

```python
import pyrealsense2 as rs
import numpy as np
import cv2
import json

# Load transformation matrices for each camera from provided text files
def load_transformation_matrix(file_path: str) -> np.ndarray:
    return np.loadtxt(file_path).reshape(4, 4)

transformation_matrix_cam0 = load_transformation_matrix("d455_camera0_transformation.txt")
transformation_matrix_cam1 = load_transformation_matrix("d455_camera1_transformation.txt")

# Initialize both RealSense pipelines
pipeline_0 = rs.pipeline()
pipeline_1 = rs.pipeline()
config_0 = rs.config()
config_1 = rs.config()

config_0.enable_device('234222303235')  # Serial number for Camera 0
config_1.enable_device('241122302541')  # Serial number for Camera 1

# Increase resolution and enable both color and depth streams
config_0.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config_0.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
config_1.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config_1.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)

# Start pipelines
profile_0 = pipeline_0.start(config_0)
profile_1 = pipeline_1.start(config_1)

# Function to apply JSON settings to sensors
def apply_settings_from_json(device, json_file):
    with open(json_file, 'r') as f:
        settings = json.load(f)

    for sensor in device.sensors:
        for option_name, option_value in settings.items():
            try:
                option = getattr(rs.option, option_name)
                if sensor.supports(option):
                    sensor.set_option(option, option_value)
                    print(f"Set {option_name} to {option_value} on {sensor.get_info(rs.camera_info.name)}")
            except AttributeError:
                print(f"Option {option_name} not found.")

# Apply JSON settings to both devices
apply_settings_from_json(profile_0.get_device(), "cameraright.json")
apply_settings_from_json(profile_1.get_device(), "cameraright.json")

# Depth scaling
depth_scale_0 = profile_0.get_device().first_depth_sensor().get_depth_scale() * 1000  # Convert to mm
depth_scale_1 = profile_1.get_device().first_depth_sensor().get_depth_scale() * 1000  # Convert to mm

# Intrinsics and rectification
intrinsics_0 = profile_0.get_stream(rs.stream.color).as_video_stream_profile().get_intrinsics()
intrinsics_1 = profile_1.get_stream(rs.stream.color).as_video_stream_profile().get_intrinsics()

map1_0, map2_0 = cv2.initUndistortRectifyMap(
    np.array([[intrinsics_0.fx, 0, intrinsics_0.ppx],
              [0, intrinsics_0.fy, intrinsics_0.ppy],
              [0, 0, 1]]),
    np.array(intrinsics_0.coeffs), None, None,
    (intrinsics_0.width, intrinsics_0.height), cv2.CV_16SC2)

map1_1, map2_1 = cv2.initUndistortRectifyMap(
    np.array([[intrinsics_1.fx, 0, intrinsics_1.ppx],
              [0, intrinsics_1.fy, intrinsics_1.ppy],
              [0, 0, 1]]),
    np.array(intrinsics_1.coeffs), None, None,
    (intrinsics_1.width, intrinsics_1.height), cv2.CV_16SC2)

# Transformation function: apply a 4x4 homogeneous transform to a 3D point
def apply_transformation(point: np.ndarray, transformation_matrix: np.ndarray) -> np.ndarray:
    point_homogeneous = np.append(point, 1)
    transformed_point = transformation_matrix @ point_homogeneous
    return transformed_point[:3] / transformed_point[3]

# Mouse callback functions: deproject the clicked pixel, then transform the point
def mouse_callback_0(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        depth_value_mm = depth_image_0[y, x] * depth_scale_0
        point_3d_m = rs.rs2_deproject_pixel_to_point(intrinsics_0, [x, y], depth_value_mm / 1000)
        point_3d_mm = np.array(point_3d_m) * 1000
        transformed_point_mm = apply_transformation(point_3d_mm, transformation_matrix_cam0)
        print(f"Camera 0 - Original Point (mm): {point_3d_mm}, Transformed Point (mm): {transformed_point_mm}")

def mouse_callback_1(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        depth_value_mm = depth_image_1[y, x] * depth_scale_1
        point_3d_m = rs.rs2_deproject_pixel_to_point(intrinsics_1, [x, y], depth_value_mm / 1000)
        point_3d_mm = np.array(point_3d_m) * 1000
        transformed_point_mm = apply_transformation(point_3d_mm, transformation_matrix_cam1)
        print(f"Camera 1 - Original Point (mm): {point_3d_mm}, Transformed Point (mm): {transformed_point_mm}")

# Set up windows and callbacks
cv2.namedWindow('Camera 0 Stream')
cv2.namedWindow('Camera 1 Stream')
cv2.setMouseCallback('Camera 0 Stream', mouse_callback_0)
cv2.setMouseCallback('Camera 1 Stream', mouse_callback_1)

try:
    while True:
        frames_0 = pipeline_0.wait_for_frames()
        frames_1 = pipeline_1.wait_for_frames()

        depth_frame_0 = frames_0.get_depth_frame()
        color_frame_0 = frames_0.get_color_frame()
        depth_frame_1 = frames_1.get_depth_frame()
        color_frame_1 = frames_1.get_color_frame()

        if not depth_frame_0 or not color_frame_0 or not depth_frame_1 or not color_frame_1:
            continue

        # Convert to numpy arrays
        depth_image_0 = np.asanyarray(depth_frame_0.get_data())
        color_image_0 = np.asanyarray(color_frame_0.get_data())
        depth_image_1 = np.asanyarray(depth_frame_1.get_data())
        color_image_1 = np.asanyarray(color_frame_1.get_data())

        # Apply undistortion
        color_image_0 = cv2.remap(color_image_0, map1_0, map2_0, cv2.INTER_LINEAR)
        color_image_1 = cv2.remap(color_image_1, map1_1, map2_1, cv2.INTER_LINEAR)

        # Display color images
        cv2.imshow('Camera 0 Stream', color_image_0)
        cv2.imshow('Camera 1 Stream', color_image_1)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    pipeline_0.stop()
    pipeline_1.stop()
    cv2.destroyAllWindows()
```

I want to make sure I get the same coordinates in Python and the RealSense Viewer. I measured manually, and the RealSense Viewer seems to provide the correct 3D measurements from each of the cameras. I experimented with the parameters in the RealSense Viewer; do these parameters keep the same values when the cameras are accessed through Python?

Please help me find the correct Python parameters to ensure the depth information I get from the Python code is correct and accurate. Let me know if you need further information. Thank you.
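
For reference, a minimal sketch (not part of the original post) that prints the option values a running pipeline actually reports, so they can be compared one by one with the values shown in the Viewer:

```python
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_device('234222303235')  # serial number from the script above
profile = pipeline.start(config)

# Print every supported option on every sensor of the device.
for sensor in profile.get_device().sensors:
    print(sensor.get_info(rs.camera_info.name))
    for option in sensor.get_supported_options():
        try:
            print(f"  {option}: {sensor.get_option(option)}")
        except RuntimeError:
            pass  # some options cannot be read while streaming

pipeline.stop()
```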

MartyG-RealSense commented 2 weeks ago

Hi @Prefiro-Prathik The RealSense Viewer tool applies a range of post-processing filters to the depth data by default, whilst a Python script applies no filters by default, as they have to be deliberately programmed into the script. So you could left-click on the blue icon next to 'Post-Processing' in the Stereo Module section of the Viewer's side panel to turn the icon red (all filters disabled). You could then check whether the 3D values are the same or different, and thereby confirm whether the filters have an effect on the correctness of the values.
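
For illustration, a minimal sketch of programming such a post-processing chain into a pyrealsense2 script; the Viewer's exact default filter set and ordering may differ from what is shown here:

```python
import pyrealsense2 as rs

# Post-processing filters, roughly mirroring what the Viewer can apply.
# Note: decimation changes the frame resolution (and hence the intrinsics).
decimation = rs.decimation_filter()
to_disparity = rs.disparity_transform(True)   # depth -> disparity
spatial = rs.spatial_filter()
temporal = rs.temporal_filter()
to_depth = rs.disparity_transform(False)      # disparity -> depth
hole_filling = rs.hole_filling_filter()

def filter_depth(depth_frame):
    """Run a raw depth frame through the filter chain before deprojection."""
    frame = decimation.process(depth_frame)
    frame = to_disparity.process(frame)
    frame = spatial.process(frame)
    frame = temporal.process(frame)
    frame = to_depth.process(frame)
    frame = hole_filling.process(frame)
    return frame
```

In the script above, the filtered frame would replace depth_frame_0 / depth_frame_1 before calling get_data() in the main loop.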

Was the configuration json that you are using exported from the RealSense Viewer? If it was, it is worth bearing in mind that not all of the Viewer's settings are exported to a json, so the Python program may apply default values for some settings instead of the values that you customized in the Viewer.
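
As an aside, a Viewer-exported json can also be loaded wholesale through the SDK's advanced-mode interface rather than option by option; a minimal sketch, assuming profile_0 and the file name from the script above:

```python
import json
import pyrealsense2 as rs

# Load a Viewer-exported preset via advanced mode (D400-series devices).
device = profile_0.get_device()
adv = rs.rs400_advanced_mode(device)
if not adv.is_enabled():
    adv.toggle_advanced_mode(True)  # note: this resets the device

with open("cameraright.json") as f:  # file name taken from the post above
    adv.load_json(json.dumps(json.load(f)))
```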

I note that in your Python script you set the streams to 640x480 resolution and 30 FPS. Are you using the same resolution and FPS settings in the Viewer, please?

In your definitions of depth_scale_0 and depth_scale_1, I would not recommend multiplying the depth scale (0.001) by 1000. Instead, wait for the final distance value in meters to be calculated and then multiply the result by 1000 to get the mm distance value (which you seem to be doing in your point_3d_mm lines).
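
For illustration, a minimal sketch of that flow (the helper name deproject_mm is hypothetical; depth_image, intrinsics, x and y are as in the script above):

```python
import numpy as np
import pyrealsense2 as rs

def deproject_mm(depth_image, intrinsics, depth_scale, x, y):
    """Deproject pixel (x, y) and return the 3D point in millimeters.

    depth_scale is the raw value from first_depth_sensor().get_depth_scale()
    (about 0.001 for a D455) and is NOT pre-multiplied by 1000.
    """
    depth_m = depth_image[y, x] * depth_scale  # raw depth units -> meters
    point_m = rs.rs2_deproject_pixel_to_point(intrinsics, [x, y], depth_m)
    return np.array(point_m) * 1000.0          # meters -> millimeters
```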

MartyG-RealSense commented 6 days ago

Hi @Prefiro-Prathik Do you require further assistance with this case, please? Thanks!