cwi-dis / cwipc

MIT License

Point Cloud Misalignment Issue During Registration Process with RealSense D455 #106

Closed ashutosh3308 closed 4 months ago

ashutosh3308 commented 6 months ago

I am doing point cloud registration using two realsense cameras.

Registration using one Camera

I use the cwipc_register --rgb --nofine command to do the registration. The output looks like this. PXL_20240507_135826904

Registration using two Cameras

Sync cables were also used. I use the cwipc_register --rgb --interactive command to do the registration. The output looks like this. PXL_20240507_141058861 MP

I use the cwipc_register --rgb --nofine command to do the registration. The output looks like this. PXL_20240507_144104452 MP

jackjansen commented 6 months ago

There have also been other issues reported for the D455, even for a single camera: a mismatch between the RGB and D data. (Reported by Sebastian Uitz from the Spirit project.)

I'm wondering whether there is some systematic offset that we need to apply to D and/or RGB that we are not doing.

FireTotemGames commented 4 months ago

Hi @jackjansen, is there already a fix for this issue? Sebastian

jackjansen commented 4 months ago

Went through our code (in RS2Camera.cpp) and the sample code in https://github.com/IntelRealSense/librealsense/tree/master/examples/pointcloud; the only obvious difference is that we first call map_to(color) and then calculate(depth), while the sample code does it the other way around.

Need to experiment with the sample code.

jackjansen commented 4 months ago

Actually, looking at the code, I think that the for loop (line 434) is a more obvious candidate for where the error is. There are a lot of manual calculations there, with strides and such. It seems like the place where a consistent offset error could be introduced.
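As an illustration (not the actual RS2Camera.cpp loop), here is a toy sketch of how mixing up the row stride with width * bytes-per-pixel in a manual pixel loop works silently for unpadded rows but produces a consistent offset once rows are padded. The frame size, stride, and helper names are all made up:

```python
# Toy illustration of a stride mixup in a manual pixel loop (made-up
# sizes, not the actual RS2Camera.cpp code).
WIDTH, HEIGHT, BPP = 4, 3, 3   # tiny hypothetical RGB frame
STRIDE = WIDTH * BPP + 2       # rows padded by 2 bytes

# Fill a buffer so each pixel's first byte uniquely encodes (y, x).
buf = bytearray(HEIGHT * STRIDE)
for y in range(HEIGHT):
    for x in range(WIDTH):
        buf[y * STRIDE + x * BPP] = y * 16 + x

def pixel_correct(x, y):
    return buf[y * STRIDE + x * BPP]       # honours the row stride

def pixel_buggy(x, y):
    return buf[y * WIDTH * BPP + x * BPP]  # assumes stride == width * bpp

print(pixel_correct(1, 2))  # 33, i.e. (y=2, x=1) as expected
print(pixel_buggy(1, 2))    # 0: the read lands in the wrong row
```

With unpadded rows (STRIDE == WIDTH * BPP) both functions agree, which is why this kind of bug can go unnoticed for a long time.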

jackjansen commented 4 months ago

And also look at the alignment examples.

jackjansen commented 4 months ago

And we should do self-calibrations of the cameras. The RealSense SDK and firmware have learned a lot of new tricks since we last looked at them (years ago).

This is a high-level overview: https://www.intelrealsense.com/self-calibration-for-depth-cameras/

And here is a very in-depth set of instructions: https://dev.intelrealsense.com/docs/self-calibration-for-depth-cameras?_ga=2.209102491.808217773.1721646934-1540133113.1721646934

Don't forget to look at the "additions" later in the document: I'm wondering whether the "wavy pattern" we've been seeing since day one could have to do with focal length discrepancies between the two IR cameras.

jackjansen commented 4 months ago

But note that we should first take a look at the Realsense examples. If the issue with RGB/D alignment does not show up in any of the Realsense examples (and in the Realsense Viewer) then it is very likely that the problem is with our code.

jackjansen commented 4 months ago

A really nice example to test, and to see how it compares to our code, is https://github.com/IntelRealSense/librealsense/blob/master/wrappers/pcl/pcl-color/rs-pcl-color.cpp because it also converts realsense frames to PCL point clouds.

That example doesn't seem to have been touched since 2018, so whether or not it exhibits the same problem as our code should give us a better lead on where to look.

I already noticed that their traversal of the RGB and D data is slightly different from ours.

jackjansen commented 4 months ago

@troeggla and @jackjansen did a lot of tests today, with single and multiple D455s and with a single D415 and a single D435.

We ran all tests with both sensors at 1280x720@15fps. The cameras were oriented in portrait mode, with the USB outlet at the top (or close to the top, for the D455).

We did a coarse calibration with cwipc_register --rgb --interactive --verbose. We ignored the fact that the R and B color channels are swapped (which has since been fixed on the master branch).

The Aruco markers were shown to be correctly detected (--verbose shows where the marker is found), with the correct bounds in the RGB image.

After that we looked at the point cloud with cwipc_view, and checked where the origin was relative to where it should be (the center of the Aruco marker).

Both the D415 and the D435 showed the origin with an offset: the D435 with a fairly large offset (20 cm or so) in the positive Z direction, the D415 with a much smaller one (5 cm or so). There may have been a slight offset in the X direction, and if there was an offset in the Y direction it was really small.

The D455 was weirder. cwipc_view started off correctly, but after about one or two seconds the origin shifted in the positive Z direction.

Four D455s were weirder still. After one or two seconds camera 1 would shift position. A few seconds later camera 4 would shift. After a minute or so camera 3 (or 2, we're unsure) would shift, and the remaining camera would shift after another minute or so. We think this shifting continued, i.e. every once in a while one of the cameras would move its origin.

To make this even weirder, it appears that the order (camera 1, camera 4, etc.) and the intervals until the shifts happen are fixed, and that they are tied to the software order of the cameras (i.e. the order in the cameraconfig.json file).

jackjansen commented 4 months ago

Next plans:

jackjansen commented 4 months ago

First: did captures with the new #119 code, and ran cwipc_register on those files. Exactly the same result: mismatched origins.

Next fixed up the debug and verbose prints to be more helpful.

Then started looking at this in a roundabout way:

@jackjansen is going to inspect the code for mapping the (u, v) values of the Aruco marker corners to (X, Y, Z)

What I did instead was run cwipc_register --interactive --no_aruco and select the four corners of the A4 paper manually. This coarse calibration was much better. So I think that, indeed, the (u, v) -> (x, y, z) mapping is the main suspect.
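For reference, a minimal sketch of the pinhole deprojection a correct (u, v) -> (x, y, z) mapping has to perform (the same math as librealsense's rs2_deproject_pixel_to_point, with distortion ignored). The intrinsics here are made-up placeholder values, not real D455 calibration:

```python
# Pinhole deprojection, the same math as rs2_deproject_pixel_to_point
# with distortion ignored. Intrinsics are made-up placeholders.
FX, FY = 640.0, 640.0    # focal lengths in pixels (hypothetical)
PPX, PPY = 640.0, 360.0  # principal point for a 1280x720 image

def deproject(u, v, depth_m):
    x = (u - PPX) / FX * depth_m
    y = (v - PPY) / FY * depth_m
    return (x, y, depth_m)

# A corner 100 px right of the image centre, 1.8 m away:
print(deproject(740, 360, 1.8))  # x is about 0.28 m: plausible metres
```

Note that dropping the principal-point subtraction or the focal-length division leaves X and Y in pixel-scaled units rather than metres, which is one obvious way a mapping like this can go wrong.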

jackjansen commented 4 months ago

More corroboration of that idea. After doing a reasonable coarse calibration, here are the corners of the A4 as hand-selected with --no_aruco:

cwipc_register: find_markers_all_tiles: camera 1: marker 9999:  3D corner: [-0.15325038 -0.00106651  0.10331016], distance to next: 0.2187547020452913
cwipc_register: find_markers_all_tiles: camera 1: marker 9999:  3D corner: [-0.14465462  0.00161063 -0.1152592 ], distance to next: 0.3093914899660473
cwipc_register: find_markers_all_tiles: camera 1: marker 9999:  3D corner: [ 0.16442417 -0.0055699  -0.10334954], distance to next: 0.20144382737216643
cwipc_register: find_markers_all_tiles: camera 1: marker 9999:  3D corner: [ 0.15508209 -0.00142229  0.0978348 ], distance to next: 0.308381290043388

You can see they're all at approximately y=0 and the distances are fairly consistent with A4 paper (which is 210x297mm).
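For the record, the sanity check can be reproduced from just the four logged corners (pure Python, values copied from the log above):

```python
import math

# The four hand-selected A4 corners from the log above (metres).
corners = [
    (-0.15325038, -0.00106651,  0.10331016),
    (-0.14465462,  0.00161063, -0.1152592),
    ( 0.16442417, -0.0055699,  -0.10334954),
    ( 0.15508209, -0.00142229,  0.0978348),
]

def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

sides = [dist(corners[i], corners[(i + 1) % 4]) for i in range(4)]
print([round(s, 3) for s in sides])  # [0.219, 0.309, 0.201, 0.308]

# A4 is 0.210 x 0.297 m; allow ~1.5 cm of selection and depth noise.
assert all(abs(s - e) < 0.015
           for s, e in zip(sides, [0.210, 0.297, 0.210, 0.297]))
```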

Here are the results for the Aruco-based detection:

cwipc_register: find_markers_all_tiles: camera 1: marker 0:  3D corner: (-0.8453696966171265, 0.14916516840457916, 1.8380000591278076), distance to next: 0.16895448987367104
cwipc_register: find_markers_all_tiles: camera 1: marker 0:  3D corner: (-0.858157753944397, -0.011399688199162483, 1.7869999408721924), distance to next: 0.1581594424754204
cwipc_register: find_markers_all_tiles: camera 1: marker 0:  3D corner: (-0.8007153868675232, -0.0656745508313179, 1.9240000247955322), distance to next: 0.16717336048597747
cwipc_register: find_markers_all_tiles: camera 1: marker 0:  3D corner: (-0.7858124375343323, 0.09463714063167572, 1.968999981880188), distance to next: 0.1538874675969009

The values are ludicrous. It looks like Z could be distance from the camera, and X and Y somehow based on u,v.

jackjansen commented 4 months ago

Should have looked at the code earlier. RS2Camera::map2d3d() has a comment stating that it is unfinished...

But still: it should have returned the correct values back when the identity matrix was still in place. So there's more to the problem.

jackjansen commented 4 months ago

I should re-read https://github.com/IntelRealSense/librealsense/issues/8221

It may be that for RealSense the whole logic is wrong: we are assuming that u,v values from the color image are valid for use in the depth image and vice versa. This is true for the Kinects, where we map one to the other before returning them to the user as auxiliary_data.

However, this is not true for the Realsense case, where there is a transform between the two images and we need to apply rs.rs2_project_color_pixel_to_depth_pixel() or the reverse.

So finding the depth value is wrong (because we use the color u, v to index the depth image), which means we'll always be off by a little bit.

This may require a new auxiliary_operation, something like map2d2d.
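To make the required operation concrete, here is a rough sketch of what rs2_project_color_pixel_to_depth_pixel has to do. The shared intrinsics and the 59 mm translation-only color-to-depth extrinsic are made up, and the real function additionally handles distortion and searches along the ray for the correct depth:

```python
# Rough sketch of what rs2_project_color_pixel_to_depth_pixel has to do.
# Shared made-up intrinsics and a made-up 59 mm translation-only
# color->depth extrinsic; the real function also handles distortion and
# searches along the ray for the correct depth.
FX, PPX, PPY = 640.0, 640.0, 360.0  # hypothetical 1280x720 intrinsics

def color_pixel_to_depth_pixel(u, v, depth_m, baseline_x=0.059):
    # 1. deproject the color pixel to a 3D point in color-camera space
    x = (u - PPX) / FX * depth_m
    y = (v - PPY) / FX * depth_m
    # 2. transform into depth-camera space (translation only, for brevity)
    x += baseline_x
    # 3. project back into the depth image
    return (x / depth_m * FX + PPX, y / depth_m * FX + PPY)

u_d, _ = color_pixel_to_depth_pixel(640, 360, 1.0)
print(u_d - 640)  # about 38 px of shift: color (u, v) is not a valid
                  # index into the depth image
```

The shift shrinks as the depth grows, which is consistent with an error that is "off by a little bit" rather than wildly wrong.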

jackjansen commented 4 months ago

But also look at https://dev.intelrealsense.com/docs/projection-in-intel-realsense-sdk-20

There is apparently an rs2::align processing block which will align the two images. We would only need to use it when the color and depth images are being requested as auxiliary data, so we don't care much about efficiency.
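For intuition, a minimal 1-D sketch of the kind of reprojection an align step performs: every depth pixel is deprojected, moved into the color camera's space, and re-projected, so that afterwards a color (u, v) indexes a valid depth. All numbers are made up, and the real rs2::align also handles differing intrinsics, distortion, and occlusion:

```python
# Minimal 1-D sketch of what an align step (like rs2::align) does:
# reproject every depth pixel into the color image. All numbers are
# made up; the real block also handles differing intrinsics,
# distortion and occlusion.
WIDTH, FX, PPX = 8, 4.0, 4.0  # toy hypothetical intrinsics
BASELINE = 0.5                # made-up depth->color offset (metres)

def align_depth_to_color(depth_row):
    aligned = [0.0] * WIDTH
    for u_d, d in enumerate(depth_row):
        if d == 0.0:
            continue                    # no depth measured here
        x = (u_d - PPX) / FX * d        # deproject in depth space
        x += BASELINE                   # move into color-camera space
        u_c = round(x / d * FX + PPX)   # project into the color image
        if 0 <= u_c < WIDTH:
            aligned[u_c] = d
    return aligned

print(align_depth_to_color([2.0] * WIDTH))  # everything shifts one pixel
```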