Depth & color alignment inaccuracy

tim-depthkit commented 4 years ago

Describe the bug The depth and color alignment is inaccurate, and the problem becomes more pronounced the further a subject is from the sensor.

I'd like to know:

Is the factory calibration process intended to solve only for close ranges? If so, can you provide information about the process and what we should expect in terms of usable aligned ranges from the factory calibration?
Is there anything that can be done to fix this issue for existing devices, aside from building our own calibrations using OpenCV or similar?

To Reproduce Point the Azure Kinect at an object with recognizable features in both depth and color. Note that the features are not properly aligned when viewing the resulting reconstructed 3D object. Note that the alignment issue is worse the further the object is placed from the sensor.

Expected behavior The depth and color alignment should maintain accuracy within the operating range of the device. Currently the calibration appears to be solving for very close range, at the expense of medium and long range accuracy.

Screenshots To visualize the issue, I used a cube with colored tape on each edge. This object has features in both depth and color that should align perfectly in the reconstruction. These examples were taken with the cube sitting about ~6-7ft from the sensor.

Imgur gallery: https://imgur.com/a/BeiNN0n

Overview of the fiducial as seen in the Azure Kinect Viewer: Overview of fiducial

Viewing the object from the top in the Azure Kinect Viewer, a shift is observed in the color-depth alignment, particularly noticeable along the front facing vertical edge of the box, where it can be seen that the border between red and green tape, which should be perfectly centered along the edge of the point cloud, gets projected too far to the right. Overview of fiducial

A more extreme angle showing the same phenomenon. Overview of fiducial

Desktop (please complete the following information):

OS with Version: Windows 10 v1909
SDK Version: 1.3.0
Firmware version:
- RGB: 1.6.102
- Depth: 1.6.75

Additional context There are a few other closed issues that show similar alignment issues, but they have been written off as occlusion problems. This is clearly not an occlusion problem, and in fact the Kinect for Windows V2 does NOT exhibit the same effects, and maintains very accurate color & depth alignment at the same distances used in this test.

JacobErvin commented 4 years ago

We are seeing the same thing. Here's a screenshot (captured in Depthkit) showing the offset between the texture and the mesh.

JacobErvin commented 4 years ago

You can see that the grid board is being significantly projected onto the whiteboard in the background.

rabbitdaxi commented 4 years ago

Thank you very much for the feedback.

You are correct. If you are using k4a_transformation_depth_image_to_color_camera(), then occlusion is handled by this function, therefore, the small depth and color misalignment with this API is not caused by occlusion. However, if you are using the k4a_transformation_color_image_to_depth_camera function, then you will see issues because of occlusion. Here is the doc which describes the difference of the transformation functions.

Now, put aside occlusion, just want to share some thoughts around the small misalignment from the depth to color transformation. Two takeaways:

It is expected that you might not be able to get the perfect alignment.
It is expected that the misalignment should not be very large though.

Here are some details:

First of all, if you are interest in the transformation algorithm, we have the reference code in the GitHub you can dig and learn, to summarize the steps:

Unproject the depth image to 3d point cloud in depth camera space (here, the depth value and depth camera intrinsics are consumed)
Transform above 3d point cloud from depth camera space to color camera space (here, the depth to color extrinsics is consumed)
Project the 3d point cloud from color camera space to the color image with interpolation (here, the color camera intrinsics is consumed, and also because of the resolution difference, bilinear interpolation in a triangle area is used)

There are few things can contributed to the end to end alignment error, which includes both camera calibration and the depth error. Even the calibration is perfect, you still can get some small misalignment because of the depth error itself. Depth could contain systematic or random error, e.g. quote from the page: "typical systematic error < 11 mm + 0.1% of distance without multi-path interference". That being said, other than depth error, the calibration can contribute to the error too. @tesych @rajeev-msft might know the calibration accuracy spec.

The team has some test in manufacturing to measure the depth to color transformation reprojection error described as the following steps:

The test scene includes certain marker board with coded marker target associated with unique id and covers most of the depth and color camera FoV overlapping.
The test captures n depth frames and 1 color image.
The test computes a mean depth image (average out the random error) and a mean ir image from the n depth frames.
The test detects the marker centers 2d pixel coordinates (float) in both the mean ir image (let us define a set m_ir) and the color image (let's defined a set m_color)
The test unprojects each marker center from m_ir to depth camera 3d space then transforms to color camera 3d space, then projects to color image (let us define a set m_ir')
The test search the markers from m_color and m_ir', once find a match based on the unique id, it computes the Euclidean distance between them in the unit of pixels in the color image, this is the transformation reprojection error.

Intuitively, at the end of the day, it is the color pixel we are mapping to the depth pixel visually defines the alignment. That is why we choose to measure the transformation reprojection error in the color image space. Moreover, color resolution normally is bigger than depth on this device. You can also imagine that with larger color resolution and smaller depth resolution (e.g. binned depth mode) it could mean larger transformation reprojection error (pixels) in the color image (because the angular error can translate to more color pixels), in that case, the color point cloud misalignment can be more visible.

Based on statistic of many devices, a few color pixels of transformation reprojection error could be expected (assume the color image resolution is K4A_COLOR_RESOLUTION_3072P and depth mode is unbinned modes).

It is hard to tell how much the misalignment from a color point cloud perspective, it will be helpful if you have quantitative reprojection error similar as what above test described.

tim-depthkit commented 4 years ago

Hi @rabbitdaxi Thank you for your response.

It is hard to tell how much the misalignment from a color point cloud perspective, it will be helpful if you have quantitative reprojection error similar as what above test described.

The color-depth misalignment shown in the point cloud is about 3/4 of an inch of error at about 6 ft away from the sensor. I am estimating this based on the width of the colored tape (1/2") used to mark the cube, and the fact that it is a bit more than one tape's width shifted to the right from where the center of the edge is in the point cloud. For a more accurate error number, I would need to write a more elaborate test application using the IR image to test it as you have described, but I don't have the time to do that at the moment, and it is obvious to me that there is a problem.

It is expected that the misalignment should not be very large though.

For our purposes, this amount of misalignment is very large. Error of this magnitude means that people in front of the sensor at a distance great enough to capture the full figure may have facial features such as their nose or eyes mapped onto their cheeks, which leads to very strange and not very life-like reproductions.

@tesych @rajeev-msft might know the calibration accuracy spec.

I am very curious to hear what the re-projection error tolerances are for this process.

The team has some test in manufacturing to measure the depth to color transformation re-projection error described as the following steps

We employ a similar approach for computing alignment error, using temporal median filtering when calibrating the Azure Kinect to an external color camera, and we are able to get error numbers in the fractions of millimeters range.

When pairing two lenses, it is possible to "solve" the camera system for close range, having very low error at the solved distance. Naturally, the error at a given distance will increase the further it is from the ideally solved range. It appears to me that the factory calibration method is solving for high accuracy at very short ranges rather than a lower average error throughout the usable range of the sensor. Can you confirm that this is true or not?

In our experience, getting a good calibration depends highly on where in space the markers used for calibration are placed in relation to the cameras. This is why I am curious about the factory calibration process, specifically how far the markers are from the sensor during the calibration process:

Are the sensors calibrated with markers placed at a single distance from the sensor, or are multiple distances used?
What are the distances used during the factory calibration process?
How large is the volume of space used for calibration? (computed as the volume of a bounding box containing all 3D points used in the calibration process)

This is why our calibration process involves computing the bounding box, so we can determine an error per cubic meter metric, which is more useful than the error by itself, since it penalizes solving the system for only a small area.

Can you also please respond to the specific questions in my original post:

Is the factory calibration process intended to solve only for close ranges? If so, can you provide information about the process and what we should expect in terms of usable aligned ranges from the factory calibration?
Is there anything that can be done to fix this issue for existing devices, aside from building our own calibrations using OpenCV or similar?

Thank you!

tesych commented 4 years ago

Thank you @tim-depthkit for the question. As @rabbitdaxi mentioned, there are many thinks can contribute to the misalignment that you experienced. Our team is currently investigating the issue further and will be happy to share our finding soon.

jasjuang commented 4 years ago

@tim-depthkit I've gone through all these troubles and you can see the conversation in #803. My conclusion is the depth is not accurate. The accuracy of the depth highly depends on the material. This is not solvable by us unless there's an update on how depth is calculated.

mellinger commented 4 years ago

The depth/color alignment issue in this Github thread is making me wary of buying a bunch of Kinect Azures for a multi-cam setup since I haven't been able to get alignment either using borrowed sensors. Are there people who have actually been able to get a color/depth aligned multi-cam cloud using Kinect Azure at this point? I can't find a clear answer anywhere.

tim-depthkit commented 4 years ago

@jasjuang I do not think that #803 stems from the same issue. I have not done a deep dive into the multi-camera examples shipped with the SDK, but our own internal multi-sensor tests do not exhibit the level of mis-aligned pointclouds you posted in that issue.

We've been able to get very closely aligned pointclouds with multiple sensors, despite this color/depth alignment issue. Of course fixing this would improve the accuracy of any multi-sensor alignment that uses the color camera and the built-in calibrations, but as I mentioned in my original post, the alignment error is around 1.5" at 6ft, which is actually difficult to notice given the noise profile of the depth data at that distance.

I think it is important to clarify that while this issue does lead to slight multi-sensor point cloud mis-alignment, the effect would not be as great as what is outlined in #805

@tesych Any updates on analysis of this issue?

jasjuang commented 4 years ago

@tim-depthkit Can you show us an example of your multi Azure Kinect alignment result? Are you doing anything special for the calibration? Or were you able to obtain good results just by using standard checkboard/charuco corners + zhang's method?

youliangtan commented 4 years ago

I am also facing similar issue on my side while trying out with different kinects. Totally agree with what @tim-depthkit has mentioned. Furthermore, I noticed that the severity of the rgb-depth frame registration error differs on every kinect sensor. This highly seems to be the problem of the default factory calibration parameters. Still waiting for a solution for this too.

tim-depthkit commented 4 years ago

@qm13 Noticed this issue is now closed. What is the outcome? Is there a fix for misaligned sensors?

ChristopherRemde commented 4 years ago

I'm also very interested to hear what the outcome is. Will this bug be fixed in future release?

GitZinger commented 4 years ago

No. It does not handle it by the k4a_transformation_depth_image_to_color_camera(). Sadly.

AswinkarthikeyenAK commented 4 years ago

I am facing the issue as well. Kindly let me know if there is a way to fix this-to align color and depth images.

prajval10 commented 3 years ago

We have the same issues as shown in https://github.com/microsoft/Azure-Kinect-Sensor-SDK/issues/1058#issuecomment-582641359 with factory calibrated Azure Kinect out-of-the-box. Please let us know if there are any workarounds for this?

seigeweapon commented 3 years ago

I find this thread after openning a new bug report #1443 . I think they are the same issue. Wondering what's the update? Given the difficulty of changing calibration data after a device is out of factory, I think the azure kinect team can consider open sourcing the calibration process.

meteorshowers commented 3 years ago

poor device

cpatel245 commented 2 years ago

Hello everyone, I am facing the similar issue regarding depth and color shift. Does anyone already found any solution/workaround for this issue?

I am also curious, since this topic is open from so long, is this issue solely related to Hardware or there is also a way to tackle it on software side?

microsoft / Azure-Kinect-Sensor-SDK

Depth & color alignment inaccuracy #1058