OpenKinect / libfreenect2

Open source drivers for the Kinect for Windows v2 device

Coordinate (x,y) in color space [1920, 1080] to coordinate (x1,y1) in depth space [512, 424] #1113

Open brunoimbrizi opened 4 years ago

brunoimbrizi commented 4 years ago

Given a coordinate (x, y) in RGB color space [1920, 1080], how to get a coordinate (x1, y1) in depth space [512, 424]?


The example below uses OpenCV to find a corner of the chessboard. It is quite precise when the Kinect's RGB frame [1920, 1080] is used, but it struggles to find the corners when passed the IR or Registered frames [512, 424]. So it makes sense to use the high-resolution image to find the corners and then convert the coordinates to look up their depths.

[screenshot: chessboard corner detected in the 1920x1080 color frame]

The idea is to get an equivalent (x1, y1) in depth space [512, 424] for (1169, 630) and then use Registration getPointXYZ(undistorted, y1, x1) to find the detected corner in the 3D point cloud.


Similar questions have been asked before, but the answers still seem unclear. It would be great to get some clarification since more people seem to be trying to achieve the same.

https://github.com/OpenKinect/libfreenect2/issues/1021 @xlz suggests using Registration apply(), but it is not clear how to use it for a coordinate in color space.

https://github.com/OpenKinect/libfreenect2/issues/708 @kalluwa seems to have successfully mapped color to depth using a 5x3 upsampling filter, but it is not clear how to do that to convert a coordinate.

https://github.com/OpenKinect/libfreenect2/issues/1081 poses a similar question and @floe suggests using the Registration class, but the methods in that class expect coordinates already in depth space.

https://github.com/OpenKinect/libfreenect2/issues/913 @madsherlock creates a high density point cloud using all 1920x1080 pixels from bigdepth, but the intention here is to find the equivalent point in the 'low density' 512x424 point cloud.

xlz commented 4 years ago

Yeah I misunderstood the direction of the mapping in that question.

The reason there is no direct color_coord -> depth_coord mapping is that the known depth_coord -> color_coord mapping is nonlinear, see https://github.com/OpenKinect/libfreenect2/blob/master/src/registration.cpp#L87. You could create a more accurate color_coord -> depth_coord mapping with your own extrinsic calibration than with the factory calibration, but this library can only work with the factory one.

bigdepth contains what you're looking for, but for all color pixels. It is indeed produced with a 5x3 upsampling filter inside the Registration class. Since we don't have a direct color_coord -> depth_coord mapping, what happens is that as the function iterates through each depth pixel, it calculates the corresponding color coordinate and assigns the depth value to the 5x3 color area around that coordinate.

But bigdepth may be a waste when you only care about one pixel. The alternative is to produce an inverse mapping of RegistrationImpl::depth_to_color, which is cubic. I think typically only numerical methods are suitable for that and we don't have an elegant implementation to be included here.
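For anyone curious what such a numerical inversion could look like: since the forward mapping is known, one brute-force option is a nearest-neighbor search over depth pixels. The sketch below is not library code; depth_to_color_stub is a hypothetical, purely illustrative stand-in for RegistrationImpl::depth_to_color (the real mapping is a nonlinear polynomial, and the depth-dependent shift applied inside apply() would also have to be accounted for, so a real search would need actual frame data):

```cpp
#include <limits>
#include <utility>

// Hypothetical stand-in for RegistrationImpl::depth_to_color: maps a depth
// pixel (dx, dy) in [512, 424] to a color coordinate (cx, cy) in [1920, 1080].
// A toy affine map is used here for illustration only; the real one is nonlinear.
static void depth_to_color_stub(float dx, float dy, float &cx, float &cy)
{
    cx = dx * 3.75f + 60.0f;
    cy = dy * 2.5f + 10.0f;
}

// Brute-force inverse: find the depth pixel whose forward-mapped color
// coordinate lands closest to the requested (color_x, color_y).
static std::pair<int, int> color_to_depth_nearest(float color_x, float color_y)
{
    int best_dx = -1, best_dy = -1;
    float best_dist = std::numeric_limits<float>::max();
    for (int dy = 0; dy < 424; ++dy) {
        for (int dx = 0; dx < 512; ++dx) {
            float cx, cy;
            depth_to_color_stub((float)dx, (float)dy, cx, cy);
            const float dist = (cx - color_x) * (cx - color_x)
                             + (cy - color_y) * (cy - color_y);
            if (dist < best_dist) {
                best_dist = dist;
                best_dx = dx;
                best_dy = dy;
            }
        }
    }
    return std::make_pair(best_dx, best_dy);
}
```

With the real mapping, the full 512x424 scan could be replaced by a coarse grid search plus local refinement, since the mapping is smooth.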

brunoimbrizi commented 4 years ago

Thank you for the clear answer.

As you mentioned, bigdepth does assign a depth to all color pixels, so the information is there, it is just not exposed.

As a quick experiment I tried creating getPointXYZBig, which is a cheap copy of getPointXYZ using bigdepth instead of undistorted and color camera parameters instead of depth camera parameters. It does return useful world coordinates, but only if the point cloud is also generated from bigdepth coordinates [1920, 1080] (which was the case in https://github.com/OpenKinect/libfreenect2/issues/913). If the point cloud is generated with RGB registered pixels, then the coordinates returned by getPointXYZBig are off.
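For reference, such a getPointXYZBig boils down to the same pinhole back-projection that getPointXYZ does, just with the color camera parameters and bigdepth's one-row padding offset (bigdepth is 1920x1082, with one extra row at top and bottom). A minimal sketch, assuming a hypothetical ColorIntrinsics struct standing in for Freenect2Device::ColorCameraParams:

```cpp
#include <cmath>
#include <limits>

// Hypothetical container for the color intrinsics
// (fill from dev->getColorCameraParams() in real code).
struct ColorIntrinsics { float fx, fy, cx, cy; };

// Back-project color pixel (row r, column c) to a 3D point, reading its depth
// from the 1920x1082 bigdepth buffer. Mirrors the arithmetic of
// Registration::getPointXYZ, but with color parameters and the row offset.
static void getPointXYZBig(const float *bigdepth, const ColorIntrinsics &p,
                           int r, int c, float &x, float &y, float &z)
{
    // +1 skips bigdepth's extra padding row at the top; depth is in mm.
    const float depth_val = bigdepth[(r + 1) * 1920 + c] / 1000.0f;
    if (std::isnan(depth_val) || depth_val <= 0.001f) {
        x = y = z = std::numeric_limits<float>::quiet_NaN();
        return;
    }
    x = (c + 0.5f - p.cx) * depth_val / p.fx;
    y = (r + 0.5f - p.cy) * depth_val / p.fy;
    z = depth_val;
}
```

As noted above, points produced this way only line up with a point cloud that was itself generated from bigdepth coordinates.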

After posting the question here I realised what I needed was an array similar to color_depth_map - last parameter of Registration apply():

[out] color_depth_map Index of mapped color pixel for each depth pixel (512x424).

Essentially that, but in the other direction:

[out] depth_color_map Index of mapped depth pixel for each color pixel (1920x1082).

I managed to hack apply() to expose the array above, but the solution is not pretty so I am not posting any code here. The good news is that it worked. In case anyone else wants to try and implement a similar hack (or a tidy version of it), the key is storing the index in another buffer just like z is stored here: https://github.com/OpenKinect/libfreenect2/blob/master/src/registration.cpp#L230
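To make the idea concrete, here is a hedged sketch of that scatter, not the actual patch: each depth pixel writes its own linear index into the 5x3 color neighborhood of a 1920x1082 buffer, keeping the nearer depth on conflicts, mirroring how z is written to bigdepth. All names are hypothetical, and mapDepthPixel is a toy stand-in for the real depth -> color mapping inside apply():

```cpp
// Toy stand-in for the library's real (nonlinear) depth -> color mapping.
static void mapDepthPixel(float dx, float dy, int &cx, int &cy)
{
    cx = (int)(dx * 3.75f);  // illustration only
    cy = (int)(dy * 2.5f);
}

// Build a color -> depth index map the same way apply() fills bigdepth:
// scatter each depth pixel into the 5x3 color area around its mapped
// coordinate. depth is 512x424 (mm); the two output buffers are 1920x1082.
static void buildDepthColorMap(const float *depth,
                               float *bigdepth,
                               int *depth_color_map)
{
    for (int i = 0; i < 1920 * 1082; ++i) {
        bigdepth[i] = 1e8f;        // "infinitely far"
        depth_color_map[i] = -1;   // no depth pixel mapped here
    }
    for (int dy = 0; dy < 424; ++dy) {
        for (int dx = 0; dx < 512; ++dx) {
            const float z = depth[dy * 512 + dx];
            if (z <= 0.0f) continue;
            int cx, cy;
            mapDepthPixel((float)dx, (float)dy, cx, cy);
            // 5 wide x 3 tall neighborhood; keep the nearest depth, just like
            // the z assignment at registration.cpp#L230.
            for (int y = cy - 1; y <= cy + 1; ++y) {
                for (int x = cx - 2; x <= cx + 2; ++x) {
                    if (x < 0 || x >= 1920 || y < -1 || y > 1080) continue;
                    const int ci = (y + 1) * 1920 + x;  // +1: row padding
                    if (z < bigdepth[ci]) {
                        bigdepth[ci] = z;
                        depth_color_map[ci] = dy * 512 + dx;  // linear index
                    }
                }
            }
        }
    }
}
```

The resulting depth_color_map is exactly the "other direction" array described above, and the lookup below decodes it back into (dx, dy).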

Coordinates in depth space [512, 424] can then be calculated with something like this:

getDepthCoord(cx, cy) {
  // +1 skips the extra padding row at the top of the 1920x1082 map
  idx = depth_color_map[(cy + 1) * 1920 + cx]  // linear depth pixel index
  dx = idx % 512          // depth column in [0, 512)
  dy = floor(idx / 512)   // depth row in [0, 424)
  return (dx, dy)
}
PengyuanHan commented 3 years ago

@brunoimbrizi Are you sure it should not be depth_color_map[(cy - 1) * 1920 + cx]? Also, I do not understand your way of getting the depth_color_map. Could you please give an explanation? Thank you very much.