IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0
7.59k stars · 4.82k forks

[L515] increasing offsets in x-y real-world-coordinates #9749

Closed · Barney1337 closed this 3 years ago

Barney1337 commented 3 years ago
Required Info
Camera Model L515
Firmware Version 01.05.08.01
Operating System & Version Win 10
Kernel Version (Linux Only) -
Platform PC
SDK Version 2.49.0.3474
Language python
Segment others

Issue Description

Dear Community,

I am using the L515 to retrieve real-world coordinates from clicked pixels in the RGB image. The coordinates are then used to control an x-y axis system that points a laser at the exact coordinate. To align the two coordinate systems (camera and axis system), I locate the axis system's origin (x=0, y=0, where the laser hits a surface) and get its camera coordinate by clicking it in the RGB image. After that, I transform clicked coordinates from the RGB image into axis-system coordinates by subtracting: camera x,y (at axis (0,0)) - clicked (x,y). (I will provide my code and an image below, so it might be easier to understand.)

image1

The issue: the further away from the axis origin (0,0) I click a pixel, the bigger the offset in the calculated real-world coordinates becomes (up to ~10 mm).

Since I copied and pasted large parts of the pyrealsense2 code without fully understanding it, the offset might be caused by a programming error. I hope someone can spot the problem and help me fix it.

# This Python file uses the following encoding: utf-8
import os
import glob
import sys

import pyrealsense2 as rs
import numpy as np
import time
import requests
import cv2
from os import path
from datetime import datetime

# GUI toolkit imports (assuming PyQt5, since loadUi is used below)
from PyQt5.QtWidgets import QApplication, QMainWindow
from PyQt5.uic import loadUi

import gthread_realsense as gt
import camaxiscalib as caxcal

class GUI(QMainWindow):

    def __init__(self):
        super(GUI,self).__init__()
        loadUi("form.ui",self)

        self.caxcalthread = caxcal.CamAxisCalibThread()
        self.gthread = gt.Gthread()
        self.maxcoord2 = [376,643]    # maximum working range of the axis system in mm
        # Manually measured camera world coordinates (in mm) of the axis origin (0,0)
        self.manualworldcalib = [round((0.18931545317173004*1000),1), round((0.011993760243058205*1000),1)]

        self.button_marking.clicked.connect(self.MarkingClicked)
        self.pixel_list = []
        self.coord_list = []

    def MarkingClicked(self):
        self.button_marking.setStyleSheet("background-color : rgb(255,120,120)")

        self.markingradius = 5
        self.markingcolor = [0,255,255]

        pipeline = rs.pipeline()
        config = rs.config()
        config.enable_stream(rs.stream.depth, 1024, 768, rs.format.z16, 30)
        config.enable_stream(rs.stream.color, 1920, 1080, rs.format.bgr8, 30)

        # Align objects
        align_to = rs.stream.color
        align = rs.align(align_to)

        # Start streaming
        pipeline.start(config)

        def click_event(event, c, r, flags, param):
            if event == cv2.EVENT_LBUTTONDOWN:
                # depth_frame and depth_intrin come from the enclosing streaming
                # loop below (the latest aligned frame at the time of the click)
                depth = depth_frame.get_distance(c, r)
                depth_point = rs.rs2_deproject_pixel_to_point(depth_intrin, [c, r], depth)
                # x and y are swapped due to the different axis coordinate system
                y = self.manualworldcalib[0]-int(depth_point[0]*1000)
                x = self.manualworldcalib[1]-int(depth_point[1]*1000)
                z = int(depth_point[2]*1000)

                print(c,r)

                if x < 0 or x > self.maxcoord2[0] or y < 0 or y > self.maxcoord2[1]:
                    print('Point outside maximum working range')

                else:
                    if z != 0:
                        self.coord_list.append([x,y])
                        self.pixel_list.append([c,r])
                    else:
                        print('No depth coordinate for this pixel')

        cv2.namedWindow('RealSense', cv2.WINDOW_AUTOSIZE)
        cv2.setMouseCallback('RealSense', click_event)

        try:
            while True:
                # Wait for a coherent pair of frames: depth and color
                frames = pipeline.wait_for_frames()
                aligned_frames = align.process(frames)
                depth_frame = aligned_frames.get_depth_frame()
                color_frame = aligned_frames.get_color_frame()

                if not depth_frame or not color_frame:
                    continue

                # Intrinsics & Extrinsics
                depth_intrin = depth_frame.profile.as_video_stream_profile().intrinsics
#                color_intrin = color_frame.profile.as_video_stream_profile().intrinsics
#                depth_to_color_extrin = depth_frame.profile.get_extrinsics_to(color_frame.profile)

                # Convert images to numpy arrays
#                depth_image = np.asanyarray(depth_frame.get_data())
                color_image = np.asanyarray(color_frame.get_data())

                # Apply colormap on depth image (image must be converted to 8-bit per pixel first)
#                depth_colormap = cv2.applyColorMap(cv2.convertScaleAbs(depth_image, alpha=0.03), cv2.COLORMAP_JET)
#                depth_colormap_dim = depth_colormap.shape
#                color_colormap_dim = color_image.shape

                croppedimages = np.copy(color_image)
                self.captureimage = np.copy(color_image)

                for point in self.pixel_list:
                    cv2.circle(croppedimages,tuple(point),self.markingradius,self.markingcolor,2)

                # Show images
                cv2.imshow('RealSense', croppedimages)

                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break

            cv2.destroyAllWindows()

        finally:
            # Stop streaming
            pipeline.stop()

app = QApplication(sys.argv)
window = GUI()
window.show()
try:
    sys.exit(app.exec_())
except:
    print("done")

Thanks in advance!

MartyG-RealSense commented 3 years ago

Hi @Barney1337 When using depth-color alignment and the get_distance instruction, drift in measurement accuracy can increase as you move further away from the 0,0 coordinate.

Could you check whether your results improve if you use the resolution 1280x720 for the color stream instead of 1920x1080, please?

Barney1337 commented 3 years ago

Hi @Barney1337 When using depth-color alignment and the get_distance instruction, drift in measurement accuracy can increase as you move further away from the 0,0 coordinate.

Could you check whether your results improve if you use the resolution 1280x720 for the color stream instead of 1920x1080, please?

Hi @MartyG-RealSense , thanks for your fast answer. I changed the RGB resolution to 1280x720 but didn't see any noticeable change in accuracy. While checking the results outside my code using the RealSense Viewer, I noticed something even worse: changing the height of the measured point results in an even bigger x-y error, which I also can't explain, since the measured depth results seem to be accurate. I attached two pictures to show the problem.

Isn't the rs2_deproject_pixel_to_point function supposed to calculate coordinates on the same camera x-y plane at different heights by using the depth information?

EDIT: ignore the issue concerning the offset at different heights - that was a mental lapse on my part

MartyG-RealSense commented 3 years ago

What is the nature of your laser, please? If it is infrared-based and projects a beam that crosses the L515's field of view, that in itself could throw off the L515's depth readings, as this camera model is vulnerable to interference from infrared light sources.

Barney1337 commented 3 years ago

I am not using a laser (except the LiDAR) during any of my accuracy measurements or while clicking the target points for the axis system. The depth information in the above picture is also pretty accurate, yet the x-y coordinates are totally different even though it is the same pixel. Could it maybe be related to wrong intrinsics or something like that?

MartyG-RealSense commented 3 years ago

Could you provide more information about the meaning of "The coordinates are then used to control an x-y axis system that points a laser at the exact coordinate"? It sounds as though the axis refers to the point on the image that you clicked on, and you want to obtain the XY coordinates of that specific pixel?

Barney1337 commented 3 years ago

Hi @Barney1337 When using depth-color alignment and the get_distance instruction, drift in measurement accuracy can increase as you move further away from the 0,0 coordinate. Could you check whether your results improve if you use the resolution 1280x720 for the color stream instead of 1920x1080, please?

Hi @MartyG-RealSense , thanks for your fast answer. I changed the RGB resolution to 1280x720 but didn't see any noticeable change in accuracy. While checking the results outside my code using the RealSense Viewer, I noticed something even worse: changing the height of the measured point results in an even bigger x-y error, which I also can't explain, since the measured depth results seem to be accurate. I attached two pictures to show the problem.

Isn't the rs2_deproject_pixel_to_point function supposed to calculate coordinates on the same camera x-y plane at different heights by using the depth information?

EDIT: ignore the issue concerning the offset at different heights - that was a mental lapse on my part

You can ignore this post, sorry about that. It seems it was too late last night when I thought about this problem. So: the offset at different heights is not as big as I thought it was.

Could you provide more information about the meaning of "The coordinates are then used to control an x-y axis system that points a laser at the exact coordinate"? It sounds as though the axis refers to the point on the image that you clicked on, and you want to obtain the XY coordinates of that specific pixel?

I think you got it right. The LiDAR sits on top of an x-y axis system, looking down at the ground. The coordinate axes (x-y) of the camera and the axis system are nearly congruent. I define x0,y0 (where the laser hits the ground in its start position) using the camera's world coordinates at that pixel. After that, I click on targets in the image and calculate the required travel distances for the axis using the camera's world coordinates for each clicked target. I attached a picture displaying the calculated distances from the start point (bottom right). You can clearly see a drift, especially along the bottom axis towards the left side. (The picture doesn't show the real setup, since I don't have access to it right now.)

prob2

MartyG-RealSense commented 3 years ago

A Python conversion of the SDK's rs-measure example, linked below, for measuring between two points may provide a useful reference for checking whether your approach to measuring between the camera location and the clicked-on location is correct.

https://github.com/soarwing52/RealsensePython/blob/master/separate%20functions/measure_new.py

Barney1337 commented 3 years ago

I first changed the calculation of the depth_point in my code from depth_intrinsics to color_intrinsics, because the example you mentioned does the same. I couldn't see any difference in the measured distances.

depth_point = rs.rs2_deproject_pixel_to_point(color_intrin, [c, r], depth)

I then used the example you provided to measure distances in the same video stream that I used in the picture in my previous post, which also resulted in an inaccuracy of around 10 mm.

image

EDIT: by the way, the camera's distance above the ground in all my use cases is around 800 mm

MartyG-RealSense commented 3 years ago

Does the observed surface have light-reflective qualities? If so, you may get greater accuracy by applying the L515's Short Range camera configuration preset. Python code for doing so is provided in https://github.com/IntelRealSense/librealsense/issues/9748#issuecomment-916217529

ev-mp commented 3 years ago

@Barney1337 , it is not clear from the description whether you use the depth image, or just the RGB image and an external laser (?):

To align the two coordinate systems (camera and axis system), I locate the axis system's origin (x=0, y=0, where the laser hits a surface) and get its camera coordinate by clicking it in the RGB image. After that, I transform clicked coordinates from the RGB image into axis-system coordinates by subtracting: camera x,y (at axis (0,0)) - clicked (x,y)

But if the core of your use case is selecting a point in the RGB image and then finding the corresponding depth/3D coordinate of that point, then the next paragraph is relevant for you: the coordinate systems of the Depth and RGB sensors are separate, and you cannot transform from one to the other by using offsets. It requires a series of projection/transformation operations that, for this specific use case, are encapsulated in the rs2_project_color_pixel_to_depth_pixel API call. If you need multiple points, it may be more appropriate to use the rs2::align class to re-project all points from the Depth plane to the RGB plane or vice versa.

In case your system is different, please try to explain it in terms of 3 coordinate systems (CS): the Depth sensor, the RGB sensor, and the external/world reference CS.

Barney1337 commented 3 years ago

Does the observed surface have light-reflective qualities? If so, you may get greater accuracy by applying the L515's Short Range camera configuration preset. Python code for doing so is provided in #9748 (comment)

Thanks for the idea. I will try to do another accuracy measurement as soon as possible.

@Barney1337 , it is not clear from the description whether you use the depth image, or just the RGB image and an external laser (?):

To align the two coordinate systems (camera and axis system), I locate the axis system's origin (x=0, y=0, where the laser hits a surface) and get its camera coordinate by clicking it in the RGB image. After that, I transform clicked coordinates from the RGB image into axis-system coordinates by subtracting: camera x,y (at axis (0,0)) - clicked (x,y)

But if the core of your use case is selecting a point in the RGB image and then finding the corresponding depth/3D coordinate of that point, then the next paragraph is relevant for you: the coordinate systems of the Depth and RGB sensors are separate, and you cannot transform from one to the other by using offsets. It requires a series of projection/transformation operations that, for this specific use case, are encapsulated in the rs2_project_color_pixel_to_depth_pixel API call. If you need multiple points, it may be more appropriate to use the rs2::align class to re-project all points from the Depth plane to the RGB plane or vice versa.

In case your system is different, please try to explain it in terms of 3 coordinate systems (CS): the Depth sensor, the RGB sensor, and the external/world reference CS.

Maybe a basic CAD drawing gives a better idea of the setup I am using the L515 in.

preview

The L515 faces the ground and is mounted above a 2D axis system, which moves mirrors to reflect a laser beam straight down to the ground (the laser is just a tool and is NOT used for any kind of measurement; it is also not turned on during the L515 depth measurements). The x and y axes of the camera CS point in the same directions as the x-y CS of the axis system in which the carriage moves. Below the axis system with the carriage are several randomly placed targets, which can be "detected" in the RGB image of the L515. The user identifies the targets in the RGB image and clicks each of them with the mouse; the pixel coordinates of each target are then used to calculate the "real-world coordinates" in the x-y plane of the camera/axis system using rs.rs2_deproject_pixel_to_point(). Knowing the coordinates, the carriage should then move straight to each target to aim the laser beam at it.

I am kind of a beginner in Python, so I am not sure I got everything right. In the code shown in my first post, I use align to align the depth and color streams (as far as I did it the right way) and rs.rs2_deproject_pixel_to_point() to calculate points from the clicked-on pixels.

MartyG-RealSense commented 3 years ago

Hi @Barney1337 May I confirm the following questions with you, please?

  1. The position of the L515 is fixed, and it generates an RGB image of the scene below the axis structure, showing the objects underneath it?

  2. And the program does not need to recognize the objects and determine their positions automatically, because the user does that manually by clicking on the areas of the RGB image corresponding to the object positions?

In this particular application, where you are clicking on specific pixels of an image, it may be better to use an alternative method to alignment / deprojection called rs2_project_color_pixel_to_depth_pixel. Instead of aligning the entire image to obtain coordinates, you can convert specific pixels from color pixels to depth pixels. An example of Python scripting for doing so is at https://github.com/IntelRealSense/librealsense/issues/5603#issuecomment-574019008

Barney1337 commented 3 years ago

Hey @MartyG-RealSense, first of all, thanks a lot for your patience and ongoing input. Both questions are confirmed with "yes".

I rewrote my script to use rs2_project_color_pixel_to_depth_pixel , but I don't really know what to do with the result. The function returns the pixel in the depth frame that corresponds to a clicked-on pixel in the color frame (?). Since I need the world coordinates of the targets below the axis to tell the carriage how far it has to travel, do I then have to calculate the world coordinates from the depth value at that pixel, or what is the approach here?

MartyG-RealSense commented 3 years ago

A RealSense user used rs2_project_color_pixel_to_depth_pixel to project the color pixel to the depth pixel, then looked up the depth value at that pixel, and finally deprojected the depth pixel to a 3D point. They posted their script code for doing so at https://github.com/IntelRealSense/librealsense/issues/2982 though it is in C++ and so would need to be converted to a Python equivalent. The script should nevertheless help to demonstrate the principle of converting the pixel coordinate into a 3D point with deprojection.
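In outline, a Python version of that three-step flow might look like the sketch below. This is only a sketch, not the #2982 code: the 0.1-10.0 m depth search range is an assumed value, and the extrinsics argument order of rs2_project_color_pixel_to_depth_pixel should be verified against the rsutil.h of your SDK version.

import pyrealsense2 as rs

pipeline = rs.pipeline()
profile = pipeline.start()  # default streams; configure depth + color via rs.config() if needed

# Sensor parameters needed by the projection call
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()
depth_profile = profile.get_stream(rs.stream.depth).as_video_stream_profile()
color_profile = profile.get_stream(rs.stream.color).as_video_stream_profile()
depth_intrin = depth_profile.get_intrinsics()
color_intrin = color_profile.get_intrinsics()
depth_to_color = depth_profile.get_extrinsics_to(color_profile)
color_to_depth = color_profile.get_extrinsics_to(depth_profile)

frames = pipeline.wait_for_frames()
depth_frame = frames.get_depth_frame()

color_pixel = [640.0, 360.0]  # hypothetical clicked pixel in the color image

# Step 1: find the depth pixel corresponding to the clicked color pixel
# (0.1 and 10.0 are an assumed min/max search depth in meters)
depth_pixel = rs.rs2_project_color_pixel_to_depth_pixel(
    depth_frame.get_data(), depth_scale, 0.1, 10.0,
    depth_intrin, color_intrin, color_to_depth, depth_to_color, color_pixel)

# Step 2: read the depth value (in meters) at that depth pixel
depth = depth_frame.get_distance(int(depth_pixel[0]), int(depth_pixel[1]))

# Step 3: deproject the depth pixel into a 3D point in the depth sensor's CS
point = rs.rs2_deproject_pixel_to_point(depth_intrin, depth_pixel, depth)
print(point)  # [x, y, z] in meters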

RalphCodesTheInternet commented 3 years ago

I faced a similar problem. I had to get accurate distance measurements between two xyz points using my L515. I also noticed that the further a point is from the origin (left, right, up, or down from the centre of the camera), the greater the measurement error grows. Only when a point is very close to the centre, regardless of range, is the measurement accurate. At first I thought I was doing something wrong. I reverted back to the RealSense Viewer and used the measuring tool, only to realise that this problem is not on my side... I solved it by building my own visualiser and using some tricky math to finally get the measurement error below roughly 0.5 cm for any point in 3D space, regardless of its distance from the origin. My code is written in JavaScript and is quite long and complex. I'd be happy to help if your problem persists... but even if you do solve this issue, the device has several other issues you're going to have to face.

MartyG-RealSense commented 3 years ago

Thanks so much @RalphCodesTheInternet for your kind advice to @Barney1337 :)

Barney1337 commented 3 years ago

A RealSense user used rs2_project_color_pixel_to_depth_pixel to project the color pixel to the depth pixel, then looked up the depth value at that pixel, and finally deprojected the depth pixel to a 3D point. They posted their script code for doing so at #2982 though it is in C++ and so would need to be converted to a Python equivalent. The script should nevertheless help to demonstrate the principle of converting the pixel coordinate into a 3D point with deprojection.

I will try that one as soon as I have time for it. Because of @RalphCodesTheInternet 's comment (thanks for that one!), I might stay with the code I have for now, which results in an error of around 10 mm max. Hardcoding a simple correction factor of around ~1.5% on the resulting travel distances gets me an accuracy of <5 mm in my operating range (at least within the same plane). I might as well try somewhat more complex correction factors later on, but that depends on how much time I have. For my use case, an accuracy of <3 mm would be necessary, but making the workflow of my setup a bit less efficient might do the trick for now.
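As a minimal sketch of that correction (assuming the ~1.5% figure above; the exact factor, and whether to multiply or divide, has to be tuned empirically per setup):

CORRECTION = 1.015  # empirical scale factor (~1.5%); tune per setup

def correct_travel(dx_mm, dy_mm):
    """Apply a simple linear correction to the computed travel distances (mm)."""
    return dx_mm * CORRECTION, dy_mm * CORRECTION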

Thanks a bunch for your help @MartyG-RealSense . If there are no more ideas on this topic, it can probably be closed, even though I would be curious whether more people are facing similar issues.

MartyG-RealSense commented 3 years ago

Thanks very much @Barney1337 for the detailed update! Let's keep the case open for another week to see whether anybody else has thoughts to contribute. Thanks again!

MartyG-RealSense commented 3 years ago

Hi again @Barney1337 As there have been no further comments made by other RealSense users, I will close this case as you suggested above. Thanks again!

rhanks26 commented 3 years ago

Hey, I'm also noticing this issue and am curious what the root cause is. @RalphCodesTheInternet , did you ever figure that out?

RalphCodesTheInternet commented 3 years ago

@rhanks26 I could never figure out the root cause. I don't even think there are any answers out there.

Essentially, what I did was take the raw depth values and compute the xyz values in another, tricky way that compensates for this error as much as possible. I managed to get the error relatively low in all directions. I can't remember all my steps, I will have to go check the code again... but in short, what I can remember off the top of my head is this:

You basically need to calculate where the selected point lies relative to the camera's centre point. You will be using the raw depth and the X and Y FOV provided in the datasheets, as well as the raw depth pixel coordinates. From these values you calculate unit sizes for the point in both the X and Y directions, and thereafter you use some trigonometry to figure out the true distance from the sensor (a rough sketch of this idea follows below).

My solution was created in JavaScript and the code is relatively long and complex, but I'd be happy to help if this is a serious problem :)
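A rough Python illustration of that idea (this is not RalphCodesTheInternet's actual JavaScript code, just a sketch of the described approach): it assumes the L515's nominal 70°x55° depth FOV from the datasheet and the 1024x768 depth resolution used earlier in this thread, and it treats the raw depth value as the distance along the pixel's ray.

import math

# Assumptions: nominal L515 depth FOV (datasheet) and 1024x768 depth resolution
FOV_X_DEG, FOV_Y_DEG = 70.0, 55.0
WIDTH, HEIGHT = 1024, 768

def fov_deproject(u, v, depth_m):
    """Approximate the XYZ of a raw depth pixel from the FOV instead of intrinsics.

    u, v    -- raw depth pixel coordinates
    depth_m -- raw depth at (u, v), interpreted as ray length in meters
    """
    # Angular offset of the pixel from the optical axis in each direction,
    # assuming the pixel-to-angle mapping is linear across the FOV
    ang_x = math.radians(FOV_X_DEG) * (u - WIDTH / 2.0) / WIDTH
    ang_y = math.radians(FOV_Y_DEG) * (v - HEIGHT / 2.0) / HEIGHT
    # Trigonometry: split the ray into forward (z) and lateral (x, y) components
    tx, ty = math.tan(ang_x), math.tan(ang_y)
    z = depth_m / math.sqrt(1.0 + tx * tx + ty * ty)
    return tx * z, ty * z, z

Whether the device reports ray length or Z distance, and whether the pixel-to-angle mapping really is linear, are exactly the details such a compensation has to get right, so treat this only as a starting point.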

5204338 commented 1 year ago

@rhanks26 I could never figure out the root cause. I don't even think there are any answers out there.

Essentially, what I did was take the raw depth values and compute the xyz values in another, tricky way that compensates for this error as much as possible. I managed to get the error relatively low in all directions. I can't remember all my steps, I will have to go check the code again... but in short, what I can remember off the top of my head is this:

You basically need to calculate where the selected point lies relative to the camera's centre point. You will be using the raw depth and the X and Y FOV provided in the datasheets, as well as the raw depth pixel coordinates. From these values you calculate unit sizes for the point in both the X and Y directions, and thereafter you use some trigonometry to figure out the true distance from the sensor.

My solution was created in JavaScript and the code is relatively long and complex, but I'd be happy to help if this is a serious problem :)

Dear Sir, my project also needs to calculate the precise X and Y coordinates of two targets in the camera coordinate system and then do further calculations based on them, but using only the rs2_deproject_pixel_to_point function to get the X and Y coordinates of the two targets is very inaccurate. Please help me.

5204338 commented 1 year ago

Hey, I'm also noticing this issue and am curious what the root cause is. @RalphCodesTheInternet , did you ever figure that out?

Dear Sir, my project also needs to calculate the precise X and Y coordinates of two targets in the camera coordinate system and then do further calculations based on them, but using only the rs2_deproject_pixel_to_point function to get the X and Y coordinates of the two targets is very inaccurate. Please help me.