IntelRealSense / realsense-ros

ROS Wrapper for Intel(R) RealSense(TM) Cameras
http://wiki.ros.org/RealSense
Apache License 2.0

2D pixel and 3D pointcloud coordinates mismatch #3180

Closed: HRItdy closed this issue 1 week ago

HRItdy commented 1 month ago

Required Info
Camera Model { D435 }
Firmware Version (Open RealSense Viewer --> Click info)
Operating System & Version Linux (Ubuntu 22)
Kernel Version (Linux Only) (e.g. 5.4)
Platform PC
Librealsense SDK Version { 2.<?>.<?> }
Language {python }
Segment {Robots }
ROS Distro {Noetic }
RealSense ROS Wrapper Version {4.51.1, 4.54.1, etc..}

Issue Description

Hi! I'm using a RealSense camera for some object detection work. Basically, I use an object detection model to get the 2D coordinates of the object mask in the color image, and then project them onto the pointcloud to get the corresponding 3D coordinates.

I used depth_image_proc/register and depth_image_proc/point_cloud_xyzrgb to do the alignment between color and depth, and the result is quite good (see the attached screenshot).

But after I get the mask and look up the 3D coordinates, the segmented pointcloud deviates from the one I want (it should be the leftmost red ball, but the output is the one circled in the attached screenshot).

My code is as follows. Is there an extra transformation between the original color image and the aligned color image? Thanks!

    def __init__(self):
        rospy.init_node('claw_depth', anonymous=True)
        self.depth_image = None
        self.info_sub = Subscriber('/realsense_wrist/depth/camera_info', CameraInfo)
        self.depth_sub = Subscriber('/aligned_depth_image', Image)
        self.depth_pub = rospy.Publisher('/depth_point_cloud', PointCloud2, queue_size=1)
        self.color_pub = rospy.Publisher('/color_point_cloud', PointCloud2, queue_size=1)
        self.seg_pub = rospy.Publisher('/segment_point_cloud', PointCloud2, queue_size=1)
        # Synchronize the topics
        self.ats = ApproximateTimeSynchronizer([self.depth_sub, self.info_sub], queue_size=5, slop=0.1)
        self.ats.registerCallback(self.callback)
        # Service server  TODO
        self.service = rospy.Service('get_depth', Centroid, self.handle_service)
        self.rate = rospy.Rate(10)  # 10 Hz
        rospy.spin()

    def callback(self, depth_image, camera_info):
        self.depth_image = np.frombuffer(depth_image.data, dtype=np.uint16).reshape(depth_image.height, depth_image.width)
        self.intrinsics = self.camera_register(camera_info)
        # get the mask
        mask, success = store_mask_client(store=False)
        indices = np.argwhere(mask)[2:, :].transpose(0, 1) # TODO: Check the coordinates after the transpose!
        self.seg_pc = self.points_to_point_cloud(indices, self.depth_image, self.intrinsics)
        self.pub_pc(self.seg_pc, self.seg_pub)
        print('done')

    def pixel_to_3d(self, pixel, depth_image, intrinsics):
        """
        Convert 2D pixel coordinates to 3D point coordinates in the camera frame.

        :param pixel: (u, v) tuple, 2D pixel coordinates
        :param depth_image: Depth image (numpy array)
        :param intrinsics: rs2.intrinsics object with the camera intrinsic parameters
        :return: (x, y, z) tuple, 3D coordinates in the camera frame
        """
        u, v = pixel
        result = rs2.rs2_deproject_pixel_to_point(intrinsics, [u, v], float(depth_image[v, u]) * 0.001)  #result[0]: right, result[1]: down, result[2]: forward
        #return result[2], -result[0], -result[1]
        return result[0], result[1], result[2]

    def camera_register(self, cameraInfo):
        _intrinsics = rs2.intrinsics()
        _intrinsics.width = cameraInfo.width
        _intrinsics.height = cameraInfo.height
        _intrinsics.ppx = cameraInfo.K[2]
        _intrinsics.ppy = cameraInfo.K[5]
        _intrinsics.fx = cameraInfo.K[0]
        _intrinsics.fy = cameraInfo.K[4]
        if cameraInfo.distortion_model == 'plumb_bob':
            _intrinsics.model = rs2.distortion.brown_conrady
        elif cameraInfo.distortion_model == 'equidistant':
            _intrinsics.model = rs2.distortion.kannala_brandt4
        _intrinsics.coeffs = [i for i in cameraInfo.D]  
        return _intrinsics

    def points_to_point_cloud(self, indices, depth_image, intrinsics):
        point_cloud = []
        for indice in indices:
            x, y, z = self.pixel_to_3d(indice, depth_image, intrinsics)
            point_cloud.append([x, y, z])
        return np.array(point_cloud)

    def cluster_pub(self, pc, pub):
        # Create a PointCloud2 message from the points
        header = Header()
        header.stamp = rospy.Time.now()
        header.frame_id = 'realsense_wrist_link'
        point_cloud_msg = point_cloud2.create_cloud_xyz32(header, pc)
        pub.publish(point_cloud_msg)

    def pub_pc(self, pc, pub):
        while not rospy.is_shutdown():
            self.cluster_pub(pc, pub)
            self.rate.sleep()
HRItdy commented 1 month ago

By the way, unfortunately I cannot use the RealSense SDK to call the align_to function directly to do the alignment, because there is an allocation conflict between the ROS launch file and the SDK.

MartyG-RealSense commented 1 month ago

Hi @HRItdy It is more usual for depth_image_proc to be used to obtain a pointcloud with the RealSense ROS1 wrapper rather than with the ROS2 wrapper.

The 'rs_rgbd.launch' ROS1 launch file, which publishes its pointcloud through depth_image_proc, sets align_depth to true, whereas it is usually set to false by default in a ROS1 launch. The file also contains a large set of instructions for calculating an XYZRGB point cloud.

https://github.com/IntelRealSense/realsense-ros/blob/ros1-legacy/realsense2_camera/launch/rs_rgbd.launch#L182-L200

So it may be worth trying to enable alignment in your ROS2 launch. If you are publishing the RealSense topics with the rs_launch.py launch file then you can enable align_depth in the launch instruction. This will have the same effect as calling align_to in the RealSense SDK.

ros2 launch realsense2_camera rs_launch.py align_depth.enable:=true
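
If it turns out that you are using the ROS1 wrapper instead, the equivalent with the ROS1 wrapper's standard rs_camera.launch file should be (a sketch, assuming the default launch file name):

roslaunch realsense2_camera rs_camera.launch align_depth:=true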

HRItdy commented 1 month ago

Thanks for your prompt response! @MartyG-RealSense Actually, align_depth is already enabled in my rs_rgbd.launch. My issue is that even with the aligned depth image, this code still cannot get the correct pointcloud (please refer to the image):

        u, v = pixel
        result = rs2.rs2_deproject_pixel_to_point(intrinsics, [u, v], float(depth_image[v, u]) * 0.001)  #result[0]: right, result[1]: down, result[2]: forward
        #return result[2], -result[0], -result[1]
        return result[0], result[1], result[2]

[attached screenshot]

In this case, my mask should overlap with the leftmost ball, but in the converted pointcloud the corresponding points end up in a strange place (circled). I suspect some transformation is not correct. Do you have an off-the-shelf package or demo that shows how to project from 2D to 3D coordinates? Any suggestion is appreciated!

MartyG-RealSense commented 1 month ago

In the specification listing at the top of this discussion your ROS wrapper version is listed as {4.51.1, 4.54.1, etc..} but I see that none of the information in that box has been edited from its defaults.

So can you confirm if you are actually using the ROS1 wrapper and the rs_rgbd launch file, please? Thanks!

HRItdy commented 1 month ago

Oh sorry, I forgot to change this... I edited a few of the fields, but because the RealSense is connected remotely, I will update the rest of the info once I have access to the remote machine. Yes, I'm using the ROS1 wrapper and the rs_rgbd launch file. The content of my launch file is:

<?xml version="1.0"?>
<launch>
    <arg name="wrist" default="true"/>
    <arg name="camera_name" value="realsense_wrist" if="$(arg wrist)"/>
    <arg name="camera_name" value="realsense_fixed" unless="$(arg wrist)"/>

    <include file="$(find realsense2_camera)/launch/rs_rgbd.launch">
        <arg name="camera" value="$(arg camera_name)"/>
        <arg name="tf_prefix" default="$(arg camera_name)"/>

        <arg name="enable_fisheye" default="false"/>
        <arg name="enable_gyro" default="false"/>
        <arg name="enable_accel" default="false"/>
        <arg name="enable_depth" default="true"/>
        <arg name="enable_infra1" default="true"/>
        <arg name="enable_infra2" default="true"/>
        <arg name="enable_color" default="true"/>
        <arg name="enable_pointcloud" default="true"/>

        <arg name="depth_fps" default="30"/>
        <arg name="infra_fps" default="30"/>
        <arg name="color_fps" default="30"/>

        <arg name="align_depth" default="true"/>
        <arg name="filters" default="decimation,disparity,spatial,temporal"/>
        <arg name="depth_width" default="640"/>
        <arg name="depth_height" default="480"/>
        <arg name="infra_width" default="640"/>
        <arg name="infra_height" default="480"/>
        <arg name="color_width" default="640"/>
        <arg name="color_height" default="480"/>
    </include>
</launch>
MartyG-RealSense commented 1 month ago

If you are using 640x480 resolution then it may be worth removing the Decimation filter. This filter 'downsamples' the depth resolution to half of the one that has been set, so the filter will reduce the pixel resolution on the depth image from 640x480 to 320x240.
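
As a sketch, that would just mean dropping decimation from the filters line in your launch file above:

<arg name="filters" default="disparity,spatial,temporal"/>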

HRItdy commented 1 month ago

Thanks for the reminder! I have removed this filter, but the segmented point cloud is still weird... Do you have any suggestions on how to precisely project 2D pixel coordinates to 3D? Thanks!

MartyG-RealSense commented 1 month ago

The ROS1 wrapper has a Python node script called 'show_center_depth.py' that converts the 2D coordinates into a 3D depth value. A node script is launched from the ROS terminal after the ROS wrapper has been launched.

https://github.com/IntelRealSense/realsense-ros/blob/ros1-legacy/realsense2_camera/scripts/show_center_depth.py

This is perhaps not what you have in mind though if you would prefer to do everything within the launch file instead of using an external script.

In regard to your mention of depth_image_proc/register, 'registration' means to align depth and color images together. As align_depth is already doing this, you could try either not using depth_image_proc/register or setting align_depth to false to see what happens.

HRItdy commented 1 month ago

Thanks! I will try this script ASAP. An external script works for me! By the way, I'm very new to RGBD cameras, so I'd like to ask: after we do the depth-to-color alignment, the depth image is overlaid on the color image, right? Then if I want to get the 3D coordinate of one 2D pixel, say [224, 125], can I get it by substituting this line

https://github.com/IntelRealSense/realsense-ros/blob/b14ce433d1cccd2dfb1ab316c0ff1715e16ab961/realsense2_camera/scripts/show_center_depth.py#L28

with pix = (224, 125), right? Should I make any other revisions, such as changing the subscribed topic to the aligned depth image? https://github.com/IntelRealSense/realsense-ros/blob/b14ce433d1cccd2dfb1ab316c0ff1715e16ab961/realsense2_camera/scripts/show_center_depth.py#L81 Thanks!

MartyG-RealSense commented 1 month ago

When align_depth is enabled, the depth image is matched to the color image coordinates. The depth image's 'field of view size' is resized to match the color image's field of view size. If a D435 type RealSense camera is being used (D435, D435i, D435f, etc) then this alignment causes the outer edges of the depth image to be excluded on the aligned image. This is because the D435 type cameras have a smaller field of view on the color sensor and so cannot see as much of the scene as the depth sensor.

The D455 type cameras have a wide field of view on both sensors, and the two field of view sizes are almost the same, so the amount of edge information that gets cut off from the aligned image is minimal.

Yes, I would recommend switching to the aligned topic, as show_center_depth.py does not use the color stream in the script's default state.
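
As a rough sketch of those two changes put together (this is not the stock show_center_depth.py; the topic names below are assumptions based on your realsense_wrist camera name, so adjust them to whatever rostopic list shows):

#!/usr/bin/env python
# Minimal sketch: deproject one fixed pixel of the aligned depth image to a 3D point.
import rospy
import numpy as np
import pyrealsense2 as rs2
from sensor_msgs.msg import Image, CameraInfo

class PixelDepthSketch:
    def __init__(self):
        rospy.init_node('pixel_depth_sketch')
        self.intrinsics = None
        rospy.Subscriber('/realsense_wrist/aligned_depth_to_color/camera_info', CameraInfo, self.info_cb)
        rospy.Subscriber('/realsense_wrist/aligned_depth_to_color/image_raw', Image, self.depth_cb)
        rospy.spin()

    def info_cb(self, msg):
        # build rs2.intrinsics from the *aligned* camera_info, which matches the color sensor
        intr = rs2.intrinsics()
        intr.width, intr.height = msg.width, msg.height
        intr.fx, intr.fy = msg.K[0], msg.K[4]
        intr.ppx, intr.ppy = msg.K[2], msg.K[5]
        intr.model = rs2.distortion.brown_conrady  # plumb_bob in the CameraInfo message
        intr.coeffs = list(msg.D)
        self.intrinsics = intr

    def depth_cb(self, msg):
        if self.intrinsics is None:
            return
        # assumes 16UC1 encoding on the aligned depth topic
        depth = np.frombuffer(msg.data, dtype=np.uint16).reshape(msg.height, msg.width)
        pix = (224, 125)  # (u, v) = (column, row) in the aligned depth image
        z = float(depth[pix[1], pix[0]]) * 0.001  # 16-bit depth is in millimetres
        point = rs2.rs2_deproject_pixel_to_point(self.intrinsics, [pix[0], pix[1]], z)
        rospy.loginfo('pixel %s -> %.3f %.3f %.3f m (color optical frame)', pix, point[0], point[1], point[2])

if __name__ == '__main__':
    PixelDepthSketch()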

HRItdy commented 1 month ago

Hi @MartyG-RealSense! Thanks very much for your explanation! I tried the show_center_depth.py script; the original code had some issues, so I slightly revised it as follows:

    def imageDepthCallback(self, data):
        depth_image = np.frombuffer(data.data, dtype=np.uint16).reshape(data.height, data.width)
        # cv_image = self.bridge.imgmsg_to_cv2(data, str(data.encoding))
        # pick one pixel among all the pixels with the closest range:
        # indices = np.array(np.where(cv_image == cv_image[cv_image > 0].min()))[:,0]
        mask, success = store_mask_client(store=False)
        indices = np.argwhere(mask)[2:, :].transpose(0, 1) 
        points = []
        for indice in indices:
            pix = (indice[1], indice[0])
            self.pix = pix
            if self.intrinsics:
                depth = depth_image[pix[1], pix[0]] * 0.001
                result = rs2.rs2_deproject_pixel_to_point(self.intrinsics, [pix[0], pix[1]], depth)
                points.append(result)
        pc = np.array(points)
        self.pub_pc(pc, self.pub)

And I have changed the subscribe topic:

depth_image_topic = '/realsense_wrist/aligned_depth_to_color/image_raw'

Unfortunately, in the published result the segmented pointcloud still doesn't line up with the ground truth (see the attached screenshot).

The upper part is the pointcloud segmented according to the mask. The mask is correct on the color frame. In the code I tried both pix = (indice[1], indice[0]) and pix = (indice[0], indice[1]), but the results are similar: the pointcloud doesn't line up with the ground truth. Do you have any ideas for debugging this? Any suggestion is appreciated. Thanks!

MartyG-RealSense commented 4 weeks ago

Are you able to move the black equipment at the bottom of the color image out of the camera's view? Black or dark gray objects are difficult for a camera to obtain depth information from because they absorb light, and the result is an area of empty black without any depth values in the approximate area where the black / dark gray surface is. This depth-empty area could be confusing the camera, given its close proximity to the separated ball.

The black area only looks as though it has depth because it is shaped like the black object. For example, a black USB cable will appear as a cable-shaped area on the depth image but there is no data in that area.

HRItdy commented 3 weeks ago

Thanks for your reply! @MartyG-RealSense Unfortunately I cannot move the gripper out of the camera's view, but I think I found the reason why the segmented points are not aligned with the entire pointcloud. The RealSense pointcloud seems to be published in a different frame from the camera_link frame. After I applied the transformation manually, the segmented part moved much closer to the entire pointcloud (see the attached screenshots).

But there is still one problem: why is there a deviation between the segmented part and the entire pointcloud? Is there any function I can use to eliminate this? Thanks!

MartyG-RealSense commented 3 weeks ago

Is it the black holes on the image that you have problems with? If it is and you are using ROS1 Noetic then you could try applying the hole_filling filter to fill in black holes.

roslaunch realsense2_camera rs_camera.launch filters:=hole_filling

If the deviation that you describe relates to something else other than the black holes then please let me know.

HRItdy commented 3 weeks ago

Hi @MartyG-RealSense Sorry, I didn't make it clear. Please see the attached image.

In this image, the red part is the segmented-out portion of the pointcloud, generated by projecting the 2D pixels corresponding to the yellow part into 3D pointcloud coordinates. So the red part should overlap with the yellow part, but currently there is a deviation between them. The code I used is similar to the one you suggested:

https://github.com/IntelRealSense/realsense-ros/blob/ros1-legacy/realsense2_camera/scripts/show_center_depth.py

depth_image = np.frombuffer(data.data, dtype=np.uint16).reshape(data.height, data.width)
mask = np.array(rospy.get_param('/pc_transform/image_mask'))
indices = np.argwhere(mask)[2:, :].transpose(0, 1) 
points = []
for indice in indices:
    pix = (indice[1], indice[0])
    self.pix = pix
    if self.intrinsics:
        depth = depth_image[pix[1], pix[0]] * 0.001
        result = rs2.rs2_deproject_pixel_to_point(self.intrinsics, [pix[0], pix[1]], depth)
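        # remap from optical-frame axes (x right, y down, z forward) to x-forward / y-left / z-up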
        points.append([result[2], -result[0], -result[1]])

Any suggestion is appreciated!

MartyG-RealSense commented 3 weeks ago

camera_link corresponds to the left infrared sensor of the camera, which is the origin point of depth.

When depth is aligned to color though, the origin of depth changes from the centerpoint of the left infrared sensor to the centerpoint of the RGB sensor, which is horizontally offset from the position of the left IR sensor on the front of the camera. So in this situation when projecting 2D pixels to 3D points, aligned intrinsics or color intrinsics are used instead of depth intrinsics.
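
A related thing that may be worth checking is the frame_id your segmented cloud is published with. rs2_deproject_pixel_to_point returns points expressed in the color optical frame (x right, y down, z forward). If those raw points are published with frame_id set to the camera link frame, RViz applies the link frame's transform on top of them, and the cloud can appear rotated or offset relative to the wrapper's own pointcloud. A minimal sketch of publishing the untransformed deprojection results in the optical frame (the frame name is an assumption based on your realsense_wrist tf_prefix, so check your TF tree):

import rospy
from std_msgs.msg import Header
from sensor_msgs.msg import PointCloud2
from sensor_msgs import point_cloud2

rospy.init_node('segment_cloud_pub_sketch')
pub = rospy.Publisher('/segment_point_cloud', PointCloud2, queue_size=1, latch=True)

points = [[0.0, 0.0, 0.5]]  # placeholder: raw [x, y, z] results from rs2_deproject_pixel_to_point
header = Header()
header.stamp = rospy.Time.now()
# label the cloud with the frame the points are actually expressed in
header.frame_id = 'realsense_wrist_color_optical_frame'  # assumed name; check your TF tree
pub.publish(point_cloud2.create_cloud_xyz32(header, points))
rospy.sleep(1.0)  # give the latched message time to go out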

If your manual adjustment involves intrinsics then it may be worth checking whether your adjustment code is using depth intrinsics instead of color or aligned intrinsics.
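
One quick way to confirm which intrinsics you are feeding into rs2_deproject_pixel_to_point is to compare the K matrix you are using against the aligned stream's camera_info (the topic name assumes your realsense_wrist prefix):

rostopic echo -n1 /realsense_wrist/aligned_depth_to_color/camera_info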


HRItdy commented 3 weeks ago

Thanks for the suggestion! I changed to the color and aligned intrinsic parameters, and there is still a deviation between the target pointcloud and the entire one (see the attached screenshot).


MartyG-RealSense commented 3 weeks ago

Are you able to access the RealSense Viewer tool? If you are then please next try resetting the calibration of your camera to its factory-new default settings using the instructions at https://github.com/IntelRealSense/librealsense/issues/10182#issuecomment-1019854487 in order to eliminate the possibility that your camera sensors have become mis-calibrated. This could occur if there is a physical shock to the camera such as a hard knock, a drop to the ground or severe vibration.

MartyG-RealSense commented 2 weeks ago

Hi @HRItdy Do you require further assistance with this case, please? Thanks!

MartyG-RealSense commented 1 week ago

Case closed due to no further comments received.