microsoft / Azure-Kinect-Sensor-SDK

A cross-platform (Linux and Windows) user-mode SDK to read data from your Azure Kinect device.
https://Azure.com/Kinect
MIT License

Transform 2d to 3d C# #1306

Open juspus opened 4 years ago

juspus commented 4 years ago

Hi, I am working on detecting a rectangle using the depth camera. At this point, I am struggling to produce accurate 3d coordinates for a selected pixel using the NFOV depth camera. I have tested this by selecting 2 points on a test board, transforming them with the Depth 2d to Color 3d transformation, calculating the distance between the resulting coordinates, and measuring the distance between the selected real-world points (918 mm). At a 2.7 m distance, I get a 2 cm error at the image center, while in the corners the error reaches up to 6 cm.
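Roughly, the check I am doing looks like this (a simplified sketch, not my exact code; the pixel coordinates and depth lookups are placeholders):

using System;
using System.Numerics;
using Microsoft.Azure.Kinect.Sensor;

public static class DistanceCheck
{
    // Transform two manually selected depth-image pixels into 3D color-camera space
    // and return the distance between them in millimeters, to compare against the
    // tape-measured 918 mm.
    public static float DistanceMm(Calibration calibration,
        Vector2 pixelA, float depthAMm,
        Vector2 pixelB, float depthBMm)
    {
        Vector3? a = calibration.TransformTo3D(pixelA, depthAMm,
            CalibrationDeviceType.Depth, CalibrationDeviceType.Color);
        Vector3? b = calibration.TransformTo3D(pixelB, depthBMm,
            CalibrationDeviceType.Depth, CalibrationDeviceType.Color);

        if (!a.HasValue || !b.HasValue)
            throw new InvalidOperationException("Pixel outside the calibrated area.");

        return Vector3.Distance(a.Value, b.Value);
    }
}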

Shouldn't the Transformation functions correct for distortion? Am I missing crucial steps for receiving accurate data? Might this be something else?

Thank you for your help!

qm13 commented 4 years ago

The 3D color mode of the k4aviewer shows how to map RGB onto depth. Please verify you are doing this correctly.

juspus commented 4 years ago

Which class, specifically, would I need to look into?

juspus commented 4 years ago

I believe you have misinterpreted my original question. I have no problem mapping RGB to depth and vice versa. The problem arises when I try to convert a 2D pixel of the depth camera to the 3D coordinate system of the color camera. I am using this conversion because of the coordinate system orientations given in the hardware documentation. Frankly, I am not even using the color camera image, only the depth image. As I understand it, the depth camera is tilted by 6 degrees about the X-axis while the color camera is not, so I am trying to convert depth image pixel coordinates to color camera real-world coordinates.

Is this a bad approach for getting the exact coordinates of the rectangle I am trying to find?

public Vector3D GetPoint(Vector2D point)
{
    var sourcePoint = new System.Numerics.Vector2((float)point.X, (float)point.Y);
    // Depth value (in mm) read from the depth image at the selected pixel.
    var depth = FindDepthAtPoint(point, depthImg);
    // Depth-camera 2D pixel + depth -> 3D point in the color camera's coordinate system.
    var gotVec = _Device.GetCalibration().TransformTo3D(sourcePoint, (float)depth, K4A.CalibrationDeviceType.Depth, K4A.CalibrationDeviceType.Color);
    if (gotVec.HasValue)
    {
        return new Vector3D(gotVec.Value.X, gotVec.Value.Y, gotVec.Value.Z);
    }
    else
    {
        // TransformTo3D returns null when the point falls outside the calibrated area.
        throw new ArgumentException();
    }
}

Shouldn't this be enough to get the exact coordinates of the point I am trying to find?

qm13 commented 4 years ago

A couple of points:

  1. How did you choose the 2d points? By manually looking at the IR image, or at the depth image? The point is, you should start with an accurate 2d pixel that matches the 3d point exactly. You will often need some texture to do this by eye, e.g. a target board with markers visible in the IR spectrum; then you can pixel-inspect (assuming the human eye gives enough precision) to find the center of each marker, or use a CV algorithm to detect the 2d points.

  2. Try transforming the 2d depth pixel only to the 3d point in depth camera space (instead of color 3d space). You only need to change the last parameter of TransformTo3D to K4A.CalibrationDeviceType.Depth, then compare the relative distance from point A to point B with the real-world measurement. This can help narrow down whether using only the depth camera intrinsics gives a better result (instead of going all the way to color space). If the 3d points from depth camera space give better results than the 3d points from color space, there may be a calibration issue with the extrinsics or the color intrinsics.

Finally, the depth camera's 6-degree tilt relative to the color camera should not matter. The calibration intrinsics account for each camera's distortion, and the extrinsics take the camera mechanics into account.

juspus commented 4 years ago

1.

I am using the OpenCVSharp library to find contours, and from there I find the corner points of the rectangle I am trying to find.
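For context, the corner detection is roughly along these lines (a simplified OpenCvSharp sketch, not my exact code; thresholding and image handling are omitted):

using System.Linq;
using OpenCvSharp;

public static class RectangleCorners
{
    // Find the largest external contour in a binary mask and approximate it with a
    // polygon; for a clean rectangular target this yields the four corner pixels.
    public static Point[] Find(Mat binaryMask)
    {
        Cv2.FindContours(binaryMask, out Point[][] contours, out HierarchyIndex[] hierarchy,
            RetrievalModes.External, ContourApproximationModes.ApproxSimple);

        Point[] largest = contours.OrderByDescending(c => Cv2.ContourArea(c)).First();
        double epsilon = 0.02 * Cv2.ArcLength(largest, closed: true);
        return Cv2.ApproxPolyDP(largest, epsilon, closed: true);
    }
}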

[Images: test result and detected rectangle]

The rectangle's real dimensions are 499x395 mm, while the edge lengths I get from the camera are:

BottomLeft-BottomRight : 501.871409658183 (3 mm off)
TopLeft-TopRight : 505.386743966989 (6 mm off)
TopLeft-BottomLeft : 417.113611458648 (22 mm off)
TopRight-BottomRight : 417.66934201898 (23 mm off)

The further I go from the image center, or the bigger the object, the worse the errors get.

[Images: test result and detected rectangle]
Real dimensions: 1797x898 mm

BottomLeft-BottomRight : 1801.43245903415
TopLeft-TopRight : 1850.56666681602
TopLeft-BottomLeft : 947.24760041082
TopRight-BottomRight : 940.148730668165

These were done with the depth 2D to depth 3D transformation. Might this be an issue of perspective? If so, how do I rectify it?

2.

I have tried different kinds of transformations, and there didn't seem to be any improvement. Here is the last rectangle again, but this time using the depth 2D to color 3D transformation:

BottomLeft-BottomRight : 1809.25862564234
TopLeft-TopRight : 1849.31981369433
TopLeft-BottomLeft : 941.742529142437
TopRight-BottomRight : 937.416335253367

qm13 commented 4 years ago

@juspus we are somewhat confused. Could you please answer these questions?

Are the images you included 2d depth images? If so, the 2d depth image is expected to be distorted.

How do you compute the following?

BottomLeft-BottomRight : 1801.43245903415
TopLeft-TopRight : 1850.56666681602
TopLeft-BottomLeft : 947.24760041082
TopRight-BottomRight : 940.148730668165

The correct approach is to transform the 2d depth pixels to 3d for those 4 corners and then find the distances in 3D space. Definitely do not do this in 2d depth space. Take the bottom-left and bottom-right corners as an example:
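An illustrative sketch of that approach, using the same C# wrapper calls as the snippets above (the corner pixels and depth values are placeholder inputs):

using System;
using System.Numerics;
using Microsoft.Azure.Kinect.Sensor;

public static class RectangleEdges
{
    // corners: the 4 detected corner pixels with their depth values (mm), ordered
    // BottomLeft, BottomRight, TopRight, TopLeft. Transform each to a 3D point in
    // depth-camera space and measure every edge in 3D, not in the 2D depth image.
    public static float[] EdgeLengthsMm(Calibration calibration, (Vector2 Pixel, float DepthMm)[] corners)
    {
        var points = new Vector3[corners.Length];
        for (int i = 0; i < corners.Length; i++)
        {
            Vector3? p = calibration.TransformTo3D(corners[i].Pixel, corners[i].DepthMm,
                CalibrationDeviceType.Depth, CalibrationDeviceType.Depth);
            if (!p.HasValue)
                throw new InvalidOperationException("Corner outside the calibrated image area.");
            points[i] = p.Value;
        }

        var edges = new float[corners.Length];
        for (int i = 0; i < corners.Length; i++)
        {
            // Distance from this corner to the next one, e.g. BottomLeft-BottomRight.
            edges[i] = Vector3.Distance(points[i], points[(i + 1) % points.Length]);
        }
        return edges;
    }
}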

juspus commented 4 years ago

yeongwoonIm commented 3 years ago

That is what I am looking into as well; I have the same problem. I think the problem is lens distortion: undistortion has been applied to the 2D depth data, but not perfectly, so I requested it in #1380.

juspus commented 3 years ago

That is what I am looking into as well; I have the same problem. I think the problem is lens distortion: undistortion has been applied to the 2D depth data, but not perfectly, so I requested it in #1380.

These two issues are not related. The Azure Kinect documentation states that the 2D images from the depth camera are distorted by default. However, the 2d-to-3d transformation should yield accurate, undistorted results. That is my issue: I am receiving inaccurate results after performing the 2d depth to 3d depth/color transformation.

TomitaTseng commented 3 years ago

I tried to convert the 3D coordinates obtained by the Kinect DK to screen coordinates and use OpenCVSharp to draw a circle, but I don't know how to write it. Are there C# or VB.Net examples of the coordinate conversion function?

juspus commented 3 years ago

I tried to convert the 3D coordinates obtained by the Kinect DK to screen coordinates and use OpenCVSharp to draw a circle, but I don't know how to write it. Are there C# or VB.Net examples of the coordinate conversion function?

Your issue does not seem to be related to the issue I am experiencing. Please refer to this [answer](https://github.com/microsoft/Azure-Kinect-Sensor-SDK/issues/1380#issuecomment-708070765).

TomitaTseng commented 3 years ago

I'm not good at English, so I didn't convey my meaning. I'm sorry!

I want to convert body tracking coordinates to screen coordinates.

How do I draw a point on the screen after getting the position coordinates of the Head joint?

Could you please tell me? Thank you very much.

        /////////////////////////////////////////
        var calibration = device.GetCalibration(deviceConfig.DepthMode, deviceConfig.ColorResolution);
        var trackerConfig = new TrackerConfiguration();
        trackerConfig.ProcessingMode = TrackerProcessingMode.Gpu;
        trackerConfig.SensorOrientation = SensorOrientation.Default;
        using (var tracker = Tracker.Create(calibration, trackerConfig))
        {
            var wantExit = false;

            while (!wantExit)
            {
                // Capture a depth frame
                using (Capture sensorCapture = device.GetCapture())
                {
                    // Queue latest frame from the sensor.
                    tracker.EnqueueCapture(sensorCapture);
                }

                // Try getting latest tracker frame.
                using (Frame frame = tracker.PopResult(TimeSpan.Zero, throwOnTimeout: false))
                {
                    if (frame != null)
                    {
                        Console.WriteLine("NumberOfBodies : " + frame.NumberOfBodies);

                        if(frame.NumberOfBodies > 0)
                        {
                            var skeleton = frame.GetBodySkeleton(0);
                            var headJoint = skeleton.GetJoint(JointId.Head);
                            Console.WriteLine("  Head : x=" + headJoint.Position.X + ",y=" + headJoint.Position.Y + ",z=" + headJoint.Position.Z);
                        }
                    }
                }

                Console.WriteLine("Press Esc to Exit. Any other key to continue.");
                var key = Console.ReadKey();
                if (key.Key.Equals(ConsoleKey.Escape)) wantExit = true;
            }
        }
juspus commented 3 years ago

I'm not good at English, so I didn't convey my meaning. I'm sorry! I want to convert body tracking coordinates to screen coordinates. [...]

Entirely not related. Sorry.

yeongwoonIm commented 3 years ago

I'm not good at English, so I didn't convey my meaning. I'm sorry! I want to convert body tracking coordinates to screen coordinates. [...]

Use the OpenCV circle function, with your position x and y as Point(x, y):

cv2.circle(img, Point(x, y), radius, color, thickness)
cv2.circle(img, (447, 63), 63, (0, 0, 255), -1)

check this site https://docs.opencv.org/master/d6/d6e/group__imgproc__draw.html#gaf10604b069374903dbd0f0488cb43670

or Google it.
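In OpenCvSharp (C#), which the question mentions, the same call would look roughly like this (the image size and colors are only illustrative values):

using OpenCvSharp;

// Placeholder image; in practice this would be the frame you want to draw on.
Mat img = new Mat(720, 1280, MatType.CV_8UC3, Scalar.All(0));
// Cv2.Circle(image, center, radius, color, thickness); thickness -1 fills the circle.
Cv2.Circle(img, new Point(447, 63), 63, new Scalar(0, 0, 255), -1);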

TomitaTseng commented 3 years ago

Use the OpenCV circle function, with your position x and y as Point(x, y):

cv2.circle(img, Point(x, y), radius, color, thickness)
cv2.circle(img, (447, 63), 63, (0, 0, 255), -1)

check this site https://docs.opencv.org/master/d6/d6e/group__imgproc__draw.html#gaf10604b069374903dbd0f0488cb43670

or Google it.

Thank you for your guidance. My English is not good, so I didn't convey my meaning, sorry! I develop the program in C#. I have the X, Y, Z coordinates of the head joint position; how can I convert them to the X, Y coordinates of the screen and then draw a point at those screen coordinates? Could you tell me? Thank you very much.

yeongwoonIm commented 3 years ago

Use the OpenCV circle function, with your position x and y as Point(x, y): cv2.circle(img, Point(x, y), radius, color, thickness), e.g. cv2.circle(img, (447, 63), 63, (0, 0, 255), -1). Check this page: https://docs.opencv.org/master/d6/d6e/group__imgproc__draw.html#gaf10604b069374903dbd0f0488cb43670 or Google it.

Thank you for your guidance. My English is not good, so I didn't convey my meaning, sorry! I develop the program in C#. I have the X, Y, Z coordinates of the head joint position; how can I convert them to the X, Y coordinates of the screen and then draw a point at those screen coordinates? Could you tell me? Thank you very much.

I think you already have the point's x and y from the X, Y, Z coordinates of the head joint position. Z means how far the object is from the Kinect DK, so you can use the X and Y on your screen.

juspus commented 3 years ago

@qm13 is there any news regarding our last correspondence? I am still experiencing the same issue as stated before. Might I be selecting the wrong depth pixels, since I cannot get the exact depth at the points I am trying to find due to occlusion?

TomitaTseng commented 3 years ago

I think you already have the point's x and y from the X, Y, Z coordinates of the head joint position. Z means how far the object is from the Kinect DK, so you can use the X and Y on your screen.

Thank you for your guidance. I have tested it: the 3D (x, y) coordinates of the head joint position are not the same as the x, y coordinates on the screen. Do your test results have the same x and y coordinates?
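One possible approach, sketched here under the assumption that the C# wrapper's Calibration also exposes a TransformTo2D counterpart to the TransformTo3D used earlier in this thread: body tracking reports joint positions as 3D points in millimeters in the depth camera's coordinate space, so they generally need to be projected back into image pixels before drawing.

using System.Numerics;
using Microsoft.Azure.Kinect.Sensor;
using OpenCvSharp;

public static class HeadMarker
{
    // Project the head joint's 3D position (mm, depth-camera space) into color-image
    // pixel coordinates, then draw it with OpenCvSharp. Assumes Calibration.TransformTo2D
    // mirrors the TransformTo3D call shown earlier in this thread.
    public static void Draw(Calibration calibration, Mat colorImage, Vector3 headPositionMm)
    {
        Vector2? pixel = calibration.TransformTo2D(headPositionMm,
            CalibrationDeviceType.Depth, CalibrationDeviceType.Color);

        if (pixel.HasValue)
        {
            Cv2.Circle(colorImage,
                new Point((int)pixel.Value.X, (int)pixel.Value.Y),
                10, new Scalar(0, 0, 255), thickness: -1);
        }
    }
}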

juspus commented 3 years ago

This issue still hasn't been resolved. Is there any development on your end, @qm13?