OpenKinect / libfreenect2

Open source drivers for the Kinect for Windows v2 device

Depth-color registration parameters are unknown #41

Closed floe closed 9 years ago

floe commented 10 years ago

I'm still trying to get the calibration/registration parameters sorted out, but I'm kinda stuck, so I'll try to write down what I think I know and hope that someone can provide some additional pointers...

The intrinsics for the depth camera seem pretty certain - for the two devices I own, they are:

depth camera intrinsic parameters:
fx 368.096588, fy 368.096588, cx 261.696594, cy 202.522202
depth camera radial distortion coeffs:
k1 0.084061, k2 -0.271582, p1 0.000000, p2 0.000000, k3 0.101907

depth camera intrinsic parameters:
fx 365.402802, fy 365.402802, cx 260.925507, cy 205.594604
depth camera radial distortion coeffs:
k1 0.095575, k2 -0.277055, p1 0.000000, p2 0.000000, k3 0.097539

This is consistent with a camera having square pixels at a resolution of 512*424 (although cy seems a bit off-center to me).

Next are color camera intrinsics - there's a 25-value block which contains the following 5 initial values on both cameras:

1081.37 - might again be focal length in pixels - FOV?
 959.5  - exactly half the x resolution -> optical center x?
 539.5  - exactly half the y resolution -> optical center y?
 863    - not really a clue - perhaps x shift relative to depth cam?
  52    - perhaps y shift relative to depth cam?

About the other 20 values, I have no real clue, although some of them look like they might form a 4x4 transformation matrix...

Any additional comments/ideas?

floe commented 10 years ago

@steppobeck @christiankerl any ideas/comments? :-)

kohrt commented 10 years ago

The data I got from two different devices look similar to yours. So each device has its own IR camera parameters, while the color parameters are the same, at least the first 5. Are the other 20 values the same on both of your devices?

I calibrated both sensors to each other on both devices. The rotation between them is nearly the identity matrix, and the translations look like this: [ -0.05192784012425894, -0.0004530758522097678, 0.0007057198534333861 ] and [ -0.05201715212410899, -0.0002404511811226246, 0.0002945446940294137 ]

The 52 could be the distance on the x axis in mm. But I have no clue what the 863 could be. And the last 20 values are pretty small.

floe commented 10 years ago

IIRC the additional 20 values are slightly different on each device, I'll re-check.

Is the exact FOV of the color/depth cams known? FOV can be calculated from intrinsic parameters, so this could be used to double-check the 1081.37 value which I suspect to be focal length.
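
As a sanity check on the pinhole model (just plugging in the numbers quoted above): the horizontal FOV is 2*atan(width / (2*fx)), which comes out to roughly 70° for the depth camera and 83° for the color camera if 1081.37 really is a focal length in pixels.

#include <cmath>
#include <cstdio>

int main()
{
    const float pi = 3.14159265f;
    // horizontal FOV = 2 * atan(width / (2 * fx)) for a pinhole camera
    float fov_depth = 2.0f * std::atan(512.0f  / (2.0f * 368.096588f)) * 180.0f / pi;
    float fov_color = 2.0f * std::atan(1920.0f / (2.0f * 1081.37f))    * 180.0f / pi;
    std::printf("depth HFOV ~ %.1f deg, color HFOV ~ %.1f deg\n", fov_depth, fov_color);
    return 0;
}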

kohrt commented 10 years ago

These are the values I got from calibrating the color sensor: fx 1060.707250708333, cx 956.354471815484, fy 1058.608326305465, cy 518.9784429882449

They are close to the ones provided by the device. So the first 3 values are fx/fy, cx, cy.

I also released a toolkit for using the Kinect v2 with ROS. It contains an OpenCL- and a CPU-based depth registration as well as a calibration tool, here. Maybe this is of use when implementing the registration directly in libfreenect2.

floe commented 10 years ago

That looks very useful, particularly since you already have an OpenCL-based implementation.

My multiple-view geometry is a bit rusty, what's the full set of parameters needed to translate from one camera space to the other? fx/fy, cx/cy for each camera + a 4x4 transformation matrix, IIRC?

kohrt commented 10 years ago

Yes, you need the intrinsics (fx, fy, cx, cy, and the distortion coefficients to rectify the images) and the extrinsics (translation and rotation) of the sensors. For each pixel of the depth image you can compute a 3D point and then transform it from the depth coordinate system to the color coordinate system. Then you can project it back onto the image plane of the color image. This results in a depth image from the perspective of the color camera.
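
Roughly like this, ignoring distortion (the Intrinsics struct, R and t below are placeholders, not values read from the device):

// Rough sketch of the pipeline described above, without distortion handling.
struct Intrinsics { float fx, fy, cx, cy; };

// R is a row-major 3x3 rotation, t a translation (depth -> color frame).
void depth_pixel_to_color_pixel(const Intrinsics& depth, const Intrinsics& color,
                                const float R[9], const float t[3],
                                float u, float v, float z,
                                float& color_u, float& color_v)
{
    // 1. back-project the depth pixel (u, v) with depth z to a 3D point
    float x = (u - depth.cx) * z / depth.fx;
    float y = (v - depth.cy) * z / depth.fy;

    // 2. transform the point from the depth frame into the color frame
    float xc = R[0] * x + R[1] * y + R[2] * z + t[0];
    float yc = R[3] * x + R[4] * y + R[5] * z + t[1];
    float zc = R[6] * x + R[7] * y + R[8] * z + t[2];

    // 3. project onto the color image plane
    color_u = color.fx * xc / zc + color.cx;
    color_v = color.fy * yc / zc + color.cy;
}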

jeppewalther commented 10 years ago

This might be helpful Mapping depth pixels with color pixels

floe commented 10 years ago

Excellent, very helpful. So to summarize, the steps + required parameters are:

1. undistort the depth image (depth distortion coefficients, known)
2. back-project each depth pixel to a 3D point (depth intrinsics, known)
3. transform the point into the color camera coordinate system (extrinsics, unknown)
4. project the point onto the color image plane (color intrinsics known, color distortion coefficients unknown)

So if we assume a regular 4x4 transformation matrix (16 values) for the 3rd step and 5 distortion coefficients for the 4th step, there's a total of 21 unknown values.

jeppewalther commented 9 years ago

I can confirm that the undistortion and backprojection parameters work very well.

floe commented 9 years ago

@jeppewalther good to know! With which codebase did you try this, and what values did you use for translation/rotation?

jeppewalther commented 9 years ago

As I mentioned, I can only confirm undistortion and backprojection, so I did not use the translation/rotation parameters. If you find numbers for these parameters I will be happy to try them out.

I am doing a university project using PCL, which I have hooked up with OpenKinect/libfreenect2. The codebase is not public at the moment.

The undistortion is surprisingly important, as seen in the image below, where undistortion has been applied to the x,y pixel coordinates.

Code for creating undistorted x,y coordinates using OpenCV:

int width = 512;
int height = 424;

// one (row, col) entry per depth pixel; note the (r, c) order, which is
// swapped back to (x, y) later when the points are used
cv::Mat cv_img_cords = cv::Mat(1, width*height, CV_32FC2);
for (int r = 0; r < height; ++r) {
    for (int c = 0; c < width; ++c) {
        cv_img_cords.at<cv::Vec2f>(0, r*width + c) = cv::Vec2f((float)r, (float)c);
    }
}

// IR camera matrix built from the parameters reported by the device
cv::Mat k = cv::Mat::eye(3, 3, CV_32F);
k.at<float>(0,0) = config.camera_params.ir.fx;
k.at<float>(1,1) = config.camera_params.ir.fy;
k.at<float>(0,2) = config.camera_params.ir.cx;
k.at<float>(1,2) = config.camera_params.ir.cy;

cv::Mat dist_coeffs = cv::Mat::zeros(1, 8, CV_32F);
dist_coeffs.at<float>(0,0) = config.camera_params.ir.k1;
dist_coeffs.at<float>(0,1) = config.camera_params.ir.k2;
dist_coeffs.at<float>(0,2) = config.camera_params.ir.p1;
dist_coeffs.at<float>(0,3) = config.camera_params.ir.p2;
dist_coeffs.at<float>(0,4) = config.camera_params.ir.k3;

// cv::Size takes (width, height)
cv::Mat new_camera_matrix = cv::getOptimalNewCameraMatrix(k, dist_coeffs, cv::Size2i(width, height), 0.0);

cv::Mat cv_img_corrected_cords;
cv::undistortPoints(cv_img_cords, cv_img_corrected_cords, k, dist_coeffs, cv::noArray(), new_camera_matrix);

Function for projection into 3D

bool ObjectTracker::depth_image_to_point_cloud(cv::Mat& depth, cv::Mat& xycords, pcl::PointCloud<PointT>::Ptr dst)
{
    // Process depth into a PCL cloud
    uint pixel_count = depth.rows * depth.cols;
    dst->resize(pixel_count);
    dst->height = depth.rows;
    dst->width = depth.cols;

    float x = 0.0f, y = 0.0f;

    float* ptr = (float*) (depth.data);
    for (uint i = 0; i < pixel_count; ++i)
    {
        // undistorted coordinates were stored as (row, col), so swap back to (x, y)
        cv::Vec2f xy = xycords.at<cv::Vec2f>(0, i);
        x = xy[1]; y = xy[0];

        PointT point;
        point.z = (static_cast<float>(*ptr)) / (1000.0f); // convert from mm to meters
        point.x = (x - config.camera_params.ir.cx) * point.z / config.camera_params.ir.fx;
        point.y = (y - config.camera_params.ir.cy) * point.z / config.camera_params.ir.fy;
        point.r = 128; point.g = 128; point.b = 128;
        dst->at(i) = point;

        ++ptr;
    }
    return true;
}

My IR Camera parameters as given by libfreenect2

struct camera_params_t
{
    struct ir_t{
        float cx = 254.878f;
        float cy = 205.395f;
        float fx = 365.456f;
        float fy = 365.456f;
        float k1 = 0.0905474;
        float k2 = -0.26819;
        float k3 = 0.0950862;
        float p1 = 0.0;
        float p2 = 0.0;
    }ir;
}camera_params;

Result

kohrt commented 9 years ago

And from where did you get the translation and rotation? By calibration? And are the IR intrinsics the ones provided by the sensor? For the rectification you can also use cv::initUndistortRectifyMap and cv::remap. The first one creates a mapping between the undistorted and distorted image, and the second one applies that mapping to an image.
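
For example, reusing the k and dist_coeffs from the snippet above (an untested sketch, with depth_image standing in for the raw 512x424 CV_32FC1 depth frame):

// build the undistortion maps once ...
cv::Mat map1, map2;
cv::initUndistortRectifyMap(k, dist_coeffs, cv::Mat(), k,
                            cv::Size(512, 424), CV_32FC1, map1, map2);

// ... then remap every incoming depth frame; nearest neighbour avoids
// blending depth values across object boundaries
cv::Mat depth_undistorted;
cv::remap(depth_image, depth_undistorted, map1, map2, cv::INTER_NEAREST);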

jeppewalther commented 9 years ago

I don't use translation and rotation for anything. I only need them when working with the RGB camera, correct? I simply do the coloring based on the IR image.

Ahh, now I see the problem... I am missing a "not" in my previous post.

"So i did use the translation/rotation parameters" should be "So i did not use the translation/rotation parameters".

Sorry, I will update the previous post.

jeppewalther commented 9 years ago

And yes all the parameters are acquired from the sensor.

vahids1639 commented 9 years ago

Hi everybody,

I am new to the Kinect 2 and the SDK. I could find the depth camera parameters, but unfortunately I could not find the RGB camera parameters.

Would you please help me? In which part of the SDK should I search?

Thanks

sh0 commented 9 years ago

I have some good news, I believe I have figured out the depth-color parameters.

The important parameters reside in intrinsics[25] (RgbCameraParamsResponse structure). New layout is as follows:

struct s_intrinsics {
    float color_f;
    float color_cx;
    float color_cy;

    float shift_d;
    float shift_m;

    float mx_x3y0; // xxx // 0x40
    float mx_x0y3; // yyy // 0x44
    float mx_x2y1; // xxy // 0x48
    float mx_x1y2; // yyx // 0x4c
    float mx_x2y0; // xx  // 0x50
    float mx_x0y2; // yy  // 0x54
    float mx_x1y1; // xy  // 0x58
    float mx_x1y0; // x   // 0x5c
    float mx_x0y1; // y   // 0x60
    float mx_x0y0; //     // 0x64

    float my_x3y0; // xxx // 0x68
    float my_x0y3; // yyy // 0x6c
    float my_x2y1; // xxy // 0x70
    float my_x1y2; // yyx // 0x74
    float my_x2y0; // xx  // 0x78
    float my_x0y2; // yy  // 0x7c
    float my_x1y1; // xy  // 0x80
    float my_x1y0; // x   // 0x84
    float my_x0y1; // y   // 0x88
    float my_x0y0; //     // 0x8c
};

There are two other important parameters for calculations:

float depth_q = 0.01;
float color_q = 0.002199;

The numeric values of those parameters are hardcoded defaults that I found. From structure layouts these values should have been sent between shift_m and mx_x3y0, but are missing. Unless someone finds those values in some packet I guess we can treat them as constants.

First we need to undistort the depth map:

// depth_k1..k3 are the depth camera radial distortion coefficients
void kinect2_undistort(float dx, float dy, float& mx, float& my)
{
    float ps = (dx * dx) + (dy * dy);
    float qs = ((ps * depth_k3 + depth_k2) * ps + depth_k1) * ps + 1.0;
    // iteratively invert the radial distortion model
    for (int i = 0; i < 9; i++) {
        float qd = ps / (qs * qs);
        qs = ((qd * depth_k3 + depth_k2) * qd + depth_k1) * qd + 1.0;
    }
    mx = dx / qs;
    my = dy / qs;
}

Next we project the depth point to the color frame coordinates:

void kinect2_depth_to_color(float mx, float my, float z, float& rx, float& ry)
{
    mx *= depth_f * depth_q;
    my *= depth_f * depth_q;

    float wx =
        (mx * mx * mx * mx_x3y0) + (my * my * my * mx_x0y3) +
        (mx * mx * my * mx_x2y1) + (my * my * mx * mx_x1y2) +
        (mx * mx * mx_x2y0) + (my * my * mx_x0y2) + (mx * my * mx_x1y1) +
        (mx * mx_x1y0) + (my * mx_x0y1) + (mx_x0y0);
    float wy =
        (mx * mx * mx * my_x3y0) + (my * my * my * my_x0y3) +
        (mx * mx * my * my_x2y1) + (my * my * mx * my_x1y2) +
        (mx * mx * my_x2y0) + (my * my * my_x0y2) + (mx * my * my_x1y1) +
        (mx * my_x1y0) + (my * my_x0y1) + (my_x0y0);

    rx = wx / (color_f * color_q);
    ry = wy / (color_f * color_q);

    rx += (shift_m / z) - (shift_m / shift_d);

    rx = rx * color_f + color_cx;
    ry = ry * color_f + color_cy;
}
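
For a single depth pixel the two functions chain together roughly like this (the depth intrinsics are the ones reported by the device; u, v and z here are just example inputs, with z in the same units as shift_d):

// pixel (u, v) of the 512x424 depth frame with raw depth value z maps to
// (rx, ry) in 1920x1080 color pixel coordinates
float u = 256.0f, v = 212.0f, z = 1500.0f;   // example inputs only

float dx = (u - depth_cx) / depth_f;         // normalized depth camera coordinates
float dy = (v - depth_cy) / depth_f;

float mx, my, rx, ry;
kinect2_undistort(dx, dy, mx, my);           // undo the depth lens distortion
kinect2_depth_to_color(mx, my, z, rx, ry);   // polynomial warp into the color frame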

I also have some knowledge about the internal LUTs, but they are quite uninteresting.

There are some implications of the way the color coordinates are calculated. The main thing is that there is no simple rotation and translation transform available to move from the color camera space to the depth camera space.

floe commented 9 years ago

Awesome, thanks for the information! Can you share how you found this? It looks like you already have a working implementation, could you perhaps send a pull request?

christiankerl commented 9 years ago

Nice work. Do you also have some insights on how to compute the xtable and ztable used for the depth computation?

sh0 commented 9 years ago

I explored the internals of kinect20.dll for about a week to get this info. Essentially the ICoordinateMapper interface contains all the functions to convert between coordinate systems. The guess was that since there are so many different functions available in that interface, the underlying transformations are probably also done nearby. I think I had to dig two classes deep, starting from CoordinateMapper, to find the core algorithms. The optimizations are not too heavy, so it was mostly readable assembly, except for the loop unrolling, which was quite annoying in the case of the undistortion code. I think I also saw a single-pass undistort somewhere, and there seemed to be a 5-pass undistort in the service exe, but it might be totally unrelated. The main data structure for the CoordinateMapper constructor came as a 0x84 byte block (hence block84) consisting entirely of floats, which is transformed into a 0x90 byte data block (hence block90) for use in calculations. I haven't really been able to trace when the block84 is copied into kinect20.dll. However, the intrinsics[25] had a layout somewhat similar to block84, so I could just make guesses and everything worked.

About the xTable and zTable: I'd really like to get those as well, but it seems a bit more complicated. These tables are calculated somewhere deeper and, as far as I know, are not exposed nor used near the surface in any interface. It looks like I need to turn my attention to vvtechs.dll, which has a bunch of shader code elements, including g_xTable and g_zTable. This dll doesn't have as many debug strings as kinect20.dll. Right now I have managed to nail down two table generation methods that fill a 512x424 table but don't use much input data. I think one of the tables was yet another undistort routine, but it's starting to become suspicious: why is there so much distortion everywhere? What might help at this point is to gather info about what kind of tables are used in the original shaders, including data types and sizes. This would be useful to identify what kind of table routines I'm possibly looking at. There are tons of loops using floating point ops, so anything that enables me to narrow the selection down is useful.

For the record, I have not installed the SDK nor run it so I also haven't accepted any EULAs (in fact I don't even have Windows). I'm only doing static analysis.

About the code, I don't really have anything committable right now. I guess the first question would also be what kind of interface we want to expose in libfreenect2. I think that at minimum we should have an interface for converting between coordinate spaces like the original SDK does. Should we expose point conversion or whole LUT table generation routines?

What I was demonstrating in the above image is a method of warping the color image to the viewpoint of the depth image using OpenGL. In the previous Kinect 1 API the registration method did the opposite: it transferred the depth image to the color camera coordinates. I don't know the details of how this was done, but I probably have to check it out to understand things fully. The way I see it, there are two problems when transforming the depth map to the color image with the Kinect 2. First, the depth values need to be recalculated to be the distance from the color camera, not the depth camera. This requires knowledge of the translation between the cameras. I don't really grasp all the implications of the discovered transformation yet, but we might be able to use the x shift for the camera coordinate translation info. The second implication is that the depth map will be huge, but only an upsampled version, so it will be computationally expensive for any application to process, yet it doesn't include more information than the original depth map.

christiankerl commented 9 years ago

Nice :) I guess the two table generation methods are the right ones. There are basically 5 tables used in the depth map computation: 3 P0Tables, which come directly from the device, and the xTable and zTable, which also seem to be some kind of polynomial (the coefficients are probably in DepthCameraParamsResponse or P0TablesResponse at the end). Dumps of an xTable and zTable are in the examples/protonect/ folder. If you plot them they are very smooth.

floe commented 9 years ago

I think a LUT table generation routine would be best, because that can be efficiently re-used for OpenGL, OpenCL and CPU. Also, if color->depth is the primary transformation provided by the official SDK, then I would suggest keeping it that way. (This is exactly how it was for the Kinect 1 SDK: the algorithm from the OpenNI SDK & coefficients from the camera provided a depth->color mapping first; the inverse direction was added later.)
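
As a sketch of what such a routine could look like (using the names from the structure posted above, not the actual libfreenect2 interface): everything except the shift_m / z term is independent of the measured depth, so it can be precomputed once per device:

// Hypothetical LUT layout: map_x/map_y hold, for every depth pixel, the color
// coordinate it maps to for z -> infinity. map_y does not depend on z at all;
// for map_x only the (shift_m / z) parallax term has to be added per frame.
static float map_x[424][512];
static float map_y[424][512];

void build_depth_to_color_map()
{
    for (int v = 0; v < 424; v++) {
        for (int u = 0; u < 512; u++) {
            float mx, my;
            kinect2_undistort((u - depth_cx) / depth_f,
                              (v - depth_cy) / depth_f, mx, my);
            // a huge z makes the shift_m / z term vanish
            kinect2_depth_to_color(mx, my, 1e12f, map_x[v][u], map_y[v][u]);
        }
    }
}

// per frame, per pixel with measured depth z:
//   color_x = map_x[v][u] + (shift_m / z) * color_f;
//   color_y = map_y[v][u];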

kohrt commented 9 years ago

Great news! I am impressed. Regarding the registration: I do a depth -> color registration. You can see the code here: https://github.com/code-iai/iai_kinect2. I tried the other way around, but that introduced ghosting effects, so I decided not to use it. If you compute 3D points out of the depth data and then rotate and translate them into the color frame, you end up with a new point p'. The z value of p' is also changed by the transformation and should be the distance in z from the color frame.

Because of the complexity I compute a low and a high resolution depth image; the low one is a quarter of Full HD. So for processing the whole cloud/depth image one could use the low-res version, and to further analyze sub-regions one could switch to the high-res version, since coordinates between the two are simply doubled/halved. And yes, if you upscale the depth, you will not get additional depth information, but a higher color resolution. It would definitely make sense to have a plug-and-play registration inside libfreenect2 that uses the parameters provided by each sensor, like the SDK does, but also to maintain access to the unregistered images to let users do more complex registrations using calibrated in- and extrinsic parameters, or even to include those registration methods as optional functionality.

floe commented 9 years ago

JFYI, I'm currently working on an implementation of the registration LUTs in https://github.com/floe/libfreenect2/compare/registration?expand=1 - comments are welcome, pull request to be submitted soon.

floe commented 9 years ago

Since the original topic of this issue has been dealt with (the parameters are known now, and the discussion regarding xTable/zTable continues at #144), I would suggest closing this?

lolz0r commented 9 years ago

I pulled the code from your repo and verified that it builds and that the protonect example runs correctly. Are there any plans to add this code to the example?

floe commented 9 years ago

@lolz0r thanks for testing, I will merge the current PR into master and close this issue for the time being (can always be reopened anyway). I'll try to add a 4th view (coloured depth image) to the example soon-ish.

anshulvj commented 9 years ago

@sh0, in the function undistort_depth, are mx and my something that I directly add to the x and y of a pixel to undistort it, or do I use them some other way? I noticed that mx and my are usually between -1 and 1, so I'm guessing they tell me how far from the origin my actual origin is (i.e. instead of 0,0 it might be -0.14, 0.25 etc.), and then I can move it to image coordinates. Is there any resource online where I can read up on how this function came about? Please note that I'm not trying to register, I'm simply trying to undistort the depth image.

sh0 commented 9 years ago

I don't think there is any resource available except this thread and the source code of libfreenect2. undistort_depth has the following inputs:

dx = (depth_x - depth_cx) / depth_f;
dy = (depth_y - depth_cy) / depth_f;

Here {depth_x, depth_y} are the pixel coordinates in the depth map, {depth_cx, depth_cy} the principal point coordinates, and depth_f the focal length of the depth camera. Essentially {dx, dy} are coordinates in depth space and the output {mx, my} are coordinates in camera space.
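
So, to answer the [-1,1] question above in code (z being the measured depth for that pixel):

// {mx, my} are normalized, undistorted camera-space coordinates. To get back
// undistorted pixel coordinates, or a 3D point for a measured depth z:
float undist_x = mx * depth_f + depth_cx;   // undistorted pixel coordinates
float undist_y = my * depth_f + depth_cy;

float X = mx * z;                           // 3D point in the depth camera frame
float Y = my * z;
float Z = z;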

anshulvj commented 9 years ago
dx = (depth_x - depth_cx) / depth_f;
dy = (depth_y - depth_cy) / depth_f;

are the coordinates in depth space up to a scale, I believe. To get the true x, y in the camera coordinate frame you have to multiply by the depth z at point (x,y). I'm already getting that correctly by using the formula here multiplied by depth(x,y), i.e.

dx = (depth_x - depth_cx) * depthImage(depth_x, depth_y) / depth_f;
dy = (depth_y - depth_cy) * depthImage(depth_x, depth_y) / depth_f;

But I am not yet using the distortion coefficients as used in the formula:

float qs = ((ps * depth_k3 + depth_k2) * ps + depth_k1) * ps + 1.0;
for (int i = 0; i < 9; i++) {
    float qd = ps / (qs * qs);
    qs = ((qd * depth_k3 + depth_k2) * qd + depth_k1) * qd + 1.0;
}
mx = dx / qs;
my = dy / qs;

I don't know what to do with mx and my. Although dx and dy are used to generate them, mx and my are in the range [-1,1], so I don't know whether I should use them to undistort first and then use the dx, dy formula to get the real x,y, or whether I should use them somehow to correct (undistort) the dx, dy I get from the first formula.

floe commented 9 years ago

@anshulvj the Registration class in current master already provides an undistorted depth image?

anshulvj commented 9 years ago

Sorry for the late reply, I was off for a week. So I checked the latest registration.cpp file here, and boy has it changed! (I'm still using the one where the undistort function is as I've given in my previous comment.) I'm guessing @sh0's and @floe's comments were referring to the latest undistort function and not the function I had in my previous comment?

Btw, why is the function named distort and not undistort? Is it distorting the undistorted IR image provided by the Kinect, so that we can register IR with color, or is it just a naming thing and it's actually undistorting the distorted IR image? I checked here and it looks like it's a distortion function and not an undistortion. So I'm a bit confused by @floe's comment that it already provides an undistorted depth image. Should I not be worrying about undistorting, and simply use the depth image given directly by the Kinect? My main goal is to get the x and y value of a depth pixel as accurately as possible; if the lens distortions are accounted and corrected for, that would be the most accurate estimate of a world point (x,y,z) represented in the camera frame, and I would use these points.

kohrt commented 9 years ago

It's called distort because it computes, for undistorted image coordinates, the corresponding distorted coordinates.
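
So building the undistorted image is an inverse mapping, roughly like this (distort(), raw_depth and undistorted are illustrative names here, not the exact registration.cpp code):

// walk over the *output* (undistorted) pixels, compute where each one lives
// in the raw (distorted) frame via distort(), and sample there
// (nearest neighbour for brevity)
for (int y = 0; y < 424; y++) {
    for (int x = 0; x < 512; x++) {
        float rx, ry;
        distort((float)x, (float)y, rx, ry);           // undistorted -> distorted
        int sx = (int)(rx + 0.5f), sy = (int)(ry + 0.5f);
        undistorted[y * 512 + x] =
            (sx >= 0 && sx < 512 && sy >= 0 && sy < 424) ? raw_depth[sy * 512 + sx] : 0.0f;
    }
}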

anshulvj commented 9 years ago

(I just edited my previous comment for what I need.) Which means that I shouldn't be using the distort function from registration at all, and should just stick to the undistorted depth image in protonect.cpp. I apologize for my ignorance; I haven't kept my heavily modified code in sync with master. I will clone the new code.

anshulvj commented 9 years ago

I tested the new libfreenect2 code, and it seems an undistorted image is already provided before we do any registration, so I don't really need to modify the registration.

SilvioGiancola commented 9 years ago

Hi everyone,

The last comment from anshulvj made me re-think the LUT and how the x and y values need to be obtained from depth. Until now, I have used the tables obtained through the GetDepthFrametoCameraSpaceTable function of the Win8 SDK, which I've saved to a file. I use the depth cv::Mat from the depth frame and multiply it by my X/Y table in order to obtain an X and a Y cv::Mat that I can use for the 3D points. Here is an example of what I was doing:

libfreenect2::Frame *depth = frames[libfreenect2::Frame::Depth];
cv::Mat MatDepth = cv::Mat(depth->height, depth->width, CV_32FC1, depth->data) / 1000.0f;
cv::Mat MatX = cv::Mat(depth->height, depth->width, CV_32FC1);
cv::Mat MatY = cv::Mat(depth->height, depth->width, CV_32FC1);
cv::multiply(MatDepth, LookUpTableX, MatX);
cv::multiply(MatDepth, LookUpTableY, MatY);

PointCloudT::Ptr PC(new PointCloudT);

for (int i = 0; i< MatDepth.rows ;i++)
{
     for (int j = 0; j< MatDepth.cols ;j++)
     {
          if ( MatDepth.at<float>(i,j)!=0)
          {
              PointT P;
              P.x = MatX.at<float>(i,j);
              P.y = MatY.at<float>(i,j);
              P.z = MatDepth.at<float>(i,j);
              PC->push_back(P);
         }
    }
}

Since the registration provides an undistorted image, I thought it useful to avoid the LUT I recover from Win8, so I apply the registration, use the undistorted image, and recover X and Y through the formulas X = Z (x - cx)/fx and Y = Z (y - cy)/fy. Here is my code:

libfreenect2::Frame *rgb = frames[libfreenect2::Frame::Color];
libfreenect2::Frame *depth = frames[libfreenect2::Frame::Depth];
libfreenect2::Frame undistorted(512, 424, 4), registered(512, 424, 4);

registration->apply(rgb,depth,&undistorted,&registered);

cv::Mat MatDepth = cv::Mat(depth->height, depth->width, CV_32FC1, depth->data) / 1000.0f;
cv::Mat MatUndist = cv::Mat(undistorted.height, undistorted.width, CV_32FC1, undistorted.data) / 1000.0f;

for (int i = 0; i< MatUndist.rows ;i++)
{
    for (int j = 0; j< MatUndist.cols ;j++)
    {
        if (MatUndist.at<float>(i,j)!=0 )
        {
            PointT P;
            P.z = MatDepth.at<float>(i,j);
            P.x = MatUndist.at<float>(i,j) * (j-paramIr.cx) / paramIr.fx;
            P.y = MatUndist.at<float>(i,j) * (i-paramIr.cy) / paramIr.fy;
            PC->push_back(P);
        }
    }
}

My doubt comes from the fact that the two methods do not return the same results. Am I doing it right? While the Z coordinate is the same, an offset is present in X and Y. I have computed the mean distance between the 3D points from the two techniques, and a mean error of 3 mm is present in both X and Y. Any chance to improve this result? Has anyone who uses PCL and 3D visualization done / seen something similar?

Thank you for your time and your help.

floe commented 9 years ago

@SilvioGiancola it's entirely possible that the reverse-engineered code misses some minor aspects of the original implementation. For example, the undistortion as implemented by @wiedemeyer uses a slightly different algorithm than the original shader; I'm not sure if they are functionally wholly equivalent.

SilvioGiancola commented 9 years ago

@floe thank you for your answer. It is good to know that it is not the exact same 3D points that are reconstructed with the current registration technique.

I have a way to find the original LUT, written in Matlab code, that takes the IR params and minimizes a function for every pixel; it may be useful if we want to compare the different results... I think this LUT may be hardcoded in the Kinect. Here is the Matlab code:

fx=366.8153;
fy=366.8153;
cc1=257.0098;
cc2=203.6805;
k2=0.09059434;
k4=-0.2688895;
k6=0.09351667;

options = optimoptions('fsolve','Display','iter');

x0=[-1;-1];
j=0;
for v=1:424
    for u=1:512
        fun=@(x) [fx * (1 + k2*(x(1)^2+x(2)^2) + k4*(x(1)^2+x(2)^2)^2 + k6*(x(1)^2+x(2)^2)^3)*x(1) + cc1 - (u-1);... 
                  fy * (1 + k2*(x(1)^2+x(2)^2) + k4*(x(1)^2+x(2)^2)^2 + k6*(x(1)^2+x(2)^2)^3)*x(2) - cc2 + (v-1)]; 

        [x,fval]=fsolve(fun,x0);

        j=j+1;

        a(j,1)=x(1);
        a(j,2)=x(2);
    end
end

I have tried to wrap it in C++ code. I have tried the Eigen library, but the Levenberg-Marquardt function is part of "Eigen-unsupported" and sometimes diverges, while Matlab's doesn't. Can anyone advise me on a library for function minimization in C++? I would like to implement it, but I don't know if it can be useful for the community.

ahundt commented 9 years ago

ceres-solver works well. It can also run reasonably fast, especially if it is given a rough estimate first.

https://ceres-solver.googlesource.com/ceres-solver/ http://ceres-solver.org/features.html

Here is an example of ceres in use for hand-eye calibration, a problem unrelated to this one, but it might show a useful example of mixing Eigen code with ceres: https://github.com/hengli/camodocal/blob/master/src/calib/HandEyeCalibration.cc#L311
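
As a rough, untested sketch, the per-pixel solve from the Matlab snippet above could look like this with ceres (same distortion model and constants; u and v are zero-based pixel coordinates):

#include <ceres/ceres.h>

// Residual: distorting the unknown normalized coordinates x must reproduce the
// given pixel (u, v). Same model and constants as in the Matlab code above.
struct UndistortResidual {
    UndistortResidual(double u, double v) : u_(u), v_(v) {}

    template <typename T>
    bool operator()(const T* const x, T* residual) const {
        const double fx = 366.8153, fy = 366.8153;
        const double cc1 = 257.0098, cc2 = 203.6805;
        const double k2 = 0.09059434, k4 = -0.2688895, k6 = 0.09351667;
        T r2 = x[0] * x[0] + x[1] * x[1];
        T d = 1.0 + k2 * r2 + k4 * r2 * r2 + k6 * r2 * r2 * r2;
        residual[0] = fx * d * x[0] + cc1 - u_;
        residual[1] = fy * d * x[1] - cc2 + v_;
        return true;
    }
    double u_, v_;
};

// Solve for one pixel; loop over the 512x424 grid to build the full table.
void undistort_pixel(double u, double v, double out[2]) {
    out[0] = -1.0; out[1] = -1.0;   // same starting point as the Matlab code
    ceres::Problem problem;
    problem.AddResidualBlock(
        new ceres::AutoDiffCostFunction<UndistortResidual, 2, 2>(
            new UndistortResidual(u, v)),
        nullptr, out);
    ceres::Solver::Options options;
    ceres::Solver::Summary summary;
    ceres::Solve(options, &problem, &summary);
}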

Starlord89 commented 7 years ago

Why is it that the parameters (focal length and principal point) differ per Kinect 2 device? Can someone explain or link to material that might be useful?

floe commented 7 years ago

@Starlord89 the differences are very small anyway; they are simply due to manufacturing tolerances, i.e. the lens on one device might be shifted by, say, 0.05 mm relative to the lens on another device, and that results in slightly different camera parameters.

letuantang commented 5 years ago

Hello everyone, I'm a newbie with the Kinect v2, but now I need to use it in my research. My problem is that I need to write Python code to solve it, and I need to use the RGB color frame (1920x1080) rather than the registered frame produced after registration. However, I don't know how to transform from (x,y) in the high-resolution color frame to (X,Y) in the small-resolution (512x424) frame. I tried a homography transformation matrix, but the result was not good. I think this is because the IR camera and the RGB camera have different fields of view (FOV). Can anyone give me some advice on how I can solve this using Python code? Thank you all for your help.