puzzlepaint / camera_calibration

Accurate geometric camera calibration with generic camera models
BSD 3-Clause "New" or "Revised" License

Problem with image resizing and image distortion correction #22

Open lingbo-yu opened 4 years ago

lingbo-yu commented 4 years ago

hi, I have tried the generic camera model in my project, and it works well in most scenes! But I found that in some scenes the image needs to be resized, or distortion correction is needed for line detection, and I do not know how to use the generic camera model for these tasks. Is there some way to do these things? Thanks very much for your help!

puzzlepaint commented 4 years ago

Hi, I am not sure whether I understand your questions correctly.

If the goal is to undistort an image from a calibrated (central) generic model, this could be done as follows: choose the pinhole camera model that the undistorted image should have (this choice is more or less arbitrary). Then:

For each pixel p in this pinhole calibration:

- Un-project the pixel coordinate of p to a direction d
- Project this direction d to a pixel coordinate p2 using the calibrated camera model
- Interpolate the pixel color at the obtained pixel coordinate p2 in the original image, and set the current pixel p in the undistorted image to this color

Resizing can be handled with the same scheme by choosing the target model accordingly.

Let me know if this does not answer your questions.

lingbo-yu commented 4 years ago

Hi, thanks for your answer!

Is there a similar method to perform this correction with the generic model?

puzzlepaint commented 4 years ago

Yes, as long as the central generic model is used, the algorithm that I tried to describe with pseudo-code should achieve exactly this. If the non-central generic model is used, however, this cannot be done like this in general, since the observation rays do not all pass through a single optical center and the mapping would then depend on scene depth.

lingbo-yu commented 4 years ago

hi @puzzlepaint, I still don't understand: how can I un-project a pixel to a direction from your description ("Un-project the pixel coordinate of p to a direction d")? I'm wondering whether some prior knowledge is needed, such as the focal length. I have tried the Unproject() and Project() functions in the demo, and the returned pixel is the same as the input.

puzzlepaint commented 4 years ago

Notice that the successive Un-projection and Projection operations in the undistortion algorithm given above operate with different camera models. If you use the same model for both, of course the same pixel as the input should be returned (otherwise, there would be a problem with the model). However, in the undistortion algorithm, the un-projection would use the pinhole model which was (arbitrarily) chosen as undistortion target, and the projection would use the calibrated generic model. There is no implementation of a pinhole camera model in the provided demo code, but it is very simple to implement based on the focal length and projection center parameters.
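
For illustration, such a pinhole model could look like the following minimal sketch (this is not part of the repository code; it assumes Eigen types, and fx, fy, cx, cy are whatever pinhole parameters are chosen for the undistortion target):

#include <Eigen/Core>

// Hypothetical minimal pinhole camera for use as the undistortion target.
struct PinholeCamera {
  double fx, fy, cx, cy;  // chosen focal lengths and projection center

  // Un-project a pixel to a direction on the z = 1 plane.
  Eigen::Vector3d Unproject(double px, double py) const {
    return Eigen::Vector3d((px - cx) / fx, (py - cy) / fy, 1.0);
  }

  // Project a direction (with d.z() > 0) to a pixel.
  Eigen::Vector2d Project(const Eigen::Vector3d& d) const {
    return Eigen::Vector2d(fx * d.x() / d.z() + cx,
                           fy * d.y() / d.z() + cy);
  }
};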

painterdrown commented 3 years ago

Hi @puzzlepaint, I am wondering how the pseudo-code achieves the undistortion.

For each pixel p in this pinhole calibration:
a. Un-project the pixel coordinate of p to a direction d using the pinhole model
b. Project this direction d to a pixel coordinate p2 using the central-generic model
c. Interpolate the pixel color at the obtained pixel coordinate p2 in the original image, and set the current pixel p in the undistorted image to this color

I am not sure what "this pinhole calibration" refers to, is it the undistorted image? If so, in step a, I think that the pixel p should firstly be distorted to p_d in the original image, then we apply the pinhole camera matrix to p_d to get its corresponding direction, am I right?

My understanding is: pixel p is the undistorted position with unknown color, after un-projection (pinhole) and projection (central-generic) we find its corresponding pixel p2 in the original image to fill the color. If it's true, then the pinhole model does the undistortion (in step a) instead of the central-generic model.

puzzlepaint commented 3 years ago

I am not sure what "this pinhole calibration" refers to, is it the undistorted image?

Yes, it is the pinhole model that was chosen for the undistorted image.

If so, in step a, I think that the pixel p should firstly be distorted to p_d in the original image, then we apply the pinhole camera matrix to p_d to get its corresponding direction, am I right?

I don't think so. Pixel p is supposed to be in the undistorted (pinhole) image, so the first step needs to be to apply the inverse pinhole camera matrix (i.e., unproject it). This returns the direction corresponding to this pixel. Then the direction can be projected to the distorted image using the calibrated generic model in order to look up the color.

My understanding is: pixel p is the undistorted position with unknown color, after un-projection (pinhole) and projection (central-generic) we find its corresponding pixel p2 in the original image to fill the color.

That sounds correct to me.

If it's true, then the pinhole model does the undistortion (in step a) instead of the central-generic model.

That also sounds correct to me if replacing "undistortion" by "unprojection". Step a in the algorithm does refer to the pinhole model (maybe that was not sufficiently clear in the original formulation).

painterdrown commented 3 years ago

Thanks for your reply @puzzlepaint. Now there is only one point of disagreement left: can the pixel p in the undistorted image be un-projected directly?

Here is how I understand imaging in the pinhole model (as described in OpenCV):

  1. A point p_w (in 3D world coordinates) is first projected to p_d (the distorted pixel in 2D image coordinates). The projection is done by applying the homography matrix H to p_w, that is: p_d = H * p_w.
  2. The distorted p_d is in the original image; we then use the distortion coefficients to calculate its undistorted position p.

so the first step needs to be to apply the inverse pinhole camera matrix (i.e., unproject it).

I believe that we can get the direction after un-projecting p_d instead of p by applying the inverse pinhole camera matrix. Could you give more details about how to "unproject it"?

puzzlepaint commented 3 years ago

The undistorted image is supposed to not have any distortion. This is why the "pinhole model" chosen for the undistorted image should have no distortion coefficients. It will only have focal length parameters (f_x, f_y) and optical center parameters (c_x, c_y). Thus, no distortion handling is necessary on the side of the undistorted image. The unprojection is simply done by applying the inverse pinhole matrix to pixel (p.x, p.y)^T to get the direction ((p.x - c_x) / f_x, (p.y - c_y) / f_y, 1)^T.
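
Written out as a matrix equation (with K being the chosen pinhole matrix), this is simply:

K = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix},
\qquad
K^{-1} \begin{pmatrix} p_x \\ p_y \\ 1 \end{pmatrix}
= \begin{pmatrix} (p_x - c_x) / f_x \\ (p_y - c_y) / f_y \\ 1 \end{pmatrix}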

In the pseudo-code algorithm, the undistorted model and the calibrated generic model are two completely independently chosen / calibrated models. This differs from the two steps you cited, where, if I understand it correctly, seemingly the distortion parameters of a calibrated parametric model may be inversely applied to get some undistorted image. This cannot be used for a generic model (since it has no separate pinhole and distortion parameters), but the algorithm presented in pseudo-code above works just fine.

painterdrown commented 3 years ago

The unprojection is simply done by applying the inverse pinhole matrix to pixel (p.x, p.y)^T to get the direction ((p.x - c_x) / f_x, (p.y - c_y) / f_y, 1)^T.

Could you explain how the direction makes sense? If I understand your paper correctly, the calibrated directions are related to original pixels instead of undistorted pixels.

I think that pixel p should be distorted to p_d based on distortion coefficients. Then applying: p_w = H^(-1) * p_d, the direction starts at (0, 0, 0) and points to p_w.

puzzlepaint commented 3 years ago

Could you explain how the direction makes sense?

This direction makes sense because it only depends on the arbitrarily chosen (!) pinhole intrinsics of the desired undistorted image. It has nothing to do with the generic calibration.

In general, starting from a calibrated central camera model, you can choose any kind of arbitrary other central camera model, and an arbitrary orientation for it, and reproject the pixels from the calibrated model into the arbitrary other model if both are located at the same optical center. (This is because central camera models basically just represent observation directions that start from a single point, the optical center. Reprojecting an image into a different central model just re-samples the image pixels at arbitrary different directions.)

With typical parametric camera models, they happen to be defined in a way that the distortion component and the "pixel mapping" component (with the f_x, f_y, c_x, c_y parameters) can be applied separately. But there is no need to restrict oneself to these same f_x, f_y, c_x, c_y parameters, and the same orientation, for the undistortion.
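
As a sketch of how the orientation freedom could look in code (a hypothetical helper building on the pinhole sketch above; R is assumed to be the rotation taking target-view coordinates to calibrated-camera coordinates):

#include <Eigen/Core>

// Direction of target pixel (x, y), expressed in the calibrated camera frame,
// for a target pinhole view rotated by R relative to the calibrated camera.
// The result is then projected with the calibrated generic model as before.
Eigen::Vector3d TargetPixelDirectionInCameraFrame(
    const PinholeCamera& target_pinhole, const Eigen::Matrix3d& R,
    double x, double y) {
  return R * target_pinhole.Unproject(x, y);
}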

I would suggest simply trying out the proposed undistortion algorithm; then you should see that it works correctly.

painterdrown commented 3 years ago

Hi @puzzlepaint, I've tried the pseudo-code on the Tango data you provided at #46. Here is what I've done:

  1. Calibrated a central_generic result.
  2. Calibrated a central_opencv result (using the same features_15px.bin).
  3. Implemented the pseudo-code you mentioned above.
void bilinear_interpolate(const cv::Mat &src_img, const double src_x, const double src_y, uchar &color) {
  // Use floor(x) and floor(x) + 1 as the interpolation corners (clamped to the
  // image); using ceil() would give a zero denominator whenever src_x or src_y
  // is an integer.
  const int src_x_f = static_cast<int>(floor(src_x));
  const int src_x_c = std::min(src_x_f + 1, src_img.cols - 1);
  const int src_y_f = static_cast<int>(floor(src_y));
  const int src_y_c = std::min(src_y_f + 1, src_img.rows - 1);
  const double w2 = src_x - src_x_f;  // weight of the right column
  const double w1 = 1.0 - w2;         // weight of the left column
  const double w4 = src_y - src_y_f;  // weight of the bottom row
  const double w3 = 1.0 - w4;         // weight of the top row
  const uchar q12 = src_img.at<uchar>(src_y_f, src_x_f);
  const uchar q22 = src_img.at<uchar>(src_y_f, src_x_c);
  const uchar q11 = src_img.at<uchar>(src_y_c, src_x_f);
  const uchar q21 = src_img.at<uchar>(src_y_c, src_x_c);
  color = w3 * (w1 * q12 + w2 * q22) + w4 * (w1 * q11 + w2 * q21);
}

bool test_undistortion() {
  // Load the central_generic model.
  const char* file_path = "/Users/rankencheng/Downloads/raxel/tango/result_15px_central_generic_20/intrinsics0.yaml";
  CentralGenericCamera<double> camera;
  std::string error_reason;
  bool result = camera.Read(file_path, &error_reason);
  if (!result) {
    std::cout << "Reading the camera failed! Reason: " << error_reason << std::endl;
    return false;
  }

  // Load fx, fy, cx, cy from the central_opencv model.
  double parameters[12] = { 277.43476524575, 277.29416182849, 307.81265362247, 233.0392096016, 0.93640031669187, 0.35692232820102, 0.015722514959806, 1.2608235807082, 0.59007496453161, 0.079399687311869, -0.00027415199228906, 5.6684176272052e-05 };
  const double& fx = parameters[0];
  const double& fy = parameters[1];
  const double& cx = parameters[2];
  const double& cy = parameters[3];

  cv::Mat src_img = cv::imread("/Users/rankencheng/Downloads/raxel/tango/image000000_f.png", cv::IMREAD_GRAYSCALE);
  cv::Mat dst_img = cv::Mat::zeros(src_img.rows * 2, src_img.cols * 2, CV_8U);  // double the width and height to obtain a larger FOV; zero-initialized since some pixels may be skipped
  const int x_begin = 0 - camera.width() * 0.5;
  const int x_end = camera.width() * 1.5;
  const int y_begin = 0 - camera.height() * 0.5;
  const int y_end = camera.height() * 1.5;

  // For each pixel p in this pinhole calibration.
  for (int x = x_begin; x < x_end; ++x) {
    for (int y = y_begin; y < y_end; ++y) {

      // Un-project the pixel coordinate of p to a direction d.
      // The unprojection is simply done by applying the inverse pinhole matrix to pixel (p.x, p.y)^T to get the direction ((p.x - c_x) / f_x, (p.y - c_y) / f_y, 1)^T.
      CentralGenericCamera<double>::PointT d((x - cx) / fx, (y - cy) / fy, 1);

      // Project this direction d to a pixel coordinate p2 using the calibrated camera model.
      CentralGenericCamera<double>::PixelT p2;
      camera.Project(d, &p2);

      // Interpolate the pixel color at the obtained pixel coordinate p2 in the original image, and set the current pixel p in the undistorted image to this color.
      // Bilinear interpolation is used.
      uchar color;
      double src_x = p2(0, 0);
      double src_y = p2(1, 0);
      if (src_x < 0 || src_x >= src_img.cols) continue;
      if (src_y < 0 || src_y >= src_img.rows) continue;
      bilinear_interpolate(src_img, src_x, src_y, color);
      dst_img.at<uchar>(y + camera.height() * 0.5, x + camera.width() * 0.5) = color;

    }
  }

  cv::imwrite("/Users/rankencheng/Downloads/raxel/test.png", dst_img);
  return true;
}

It seems that the algorithm works!

By the way, I was wrong about the calibrated directions, which is why I made this mistake:

I think that pixel p should be distorted to p_d based on distortion coefficients. Then applying: p_w = H^(-1) * p_d, the direction starts at (0, 0, 0) and points to p_w.

The direction ((p.x - c_x) / f_x, (p.y - c_y) / f_y, 1)^T makes sense because it is exactly the point (which can be regarded as the direction) that results from applying the inverse camera intrinsics matrix to the pixel.

To conclude: the pixel to be un-projected should be the undistorted one (in the pinhole model) rather than the distorted one, while the pixel obtained after projection into the generic model is the distorted one.

Thank you very much!

Duncan1115 commented 3 years ago

Hi @painterdrown, thank you for writing this code. It seems to take a long time to compute all the pixels of the undistorted image. Am I using it in the wrong way?

puzzlepaint commented 3 years ago

The Project() function uses an optimization process internally, which is expected to be slow. If your camera calibration does not change too often, it is thus likely helpful to cache the result of the unprojection->projection process for each pixel in the camera image for a given camera calibration. This then allows computing undistorted images very quickly for this calibration.
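
For example, one way to do this with OpenCV (a sketch, not from the repository; dst_w, dst_h, fx, fy, cx, cy are assumed to be the pinhole parameters chosen for the undistorted image, and cv::remap then performs the per-frame bilinear interpolation):

#include <opencv2/imgproc.hpp>

// Build the remap tables once per calibration (slow because of Project()),
// then undistort each frame with cv::remap (fast).
cv::Mat map_x(dst_h, dst_w, CV_32FC1);
cv::Mat map_y(dst_h, dst_w, CV_32FC1);
for (int y = 0; y < dst_h; ++y) {
  for (int x = 0; x < dst_w; ++x) {
    CentralGenericCamera<double>::PointT d((x - cx) / fx, (y - cy) / fy, 1);
    CentralGenericCamera<double>::PixelT p2;
    camera.Project(d, &p2);  // runs the internal optimization only once per pixel
    map_x.at<float>(y, x) = static_cast<float>(p2(0, 0));
    map_y.at<float>(y, x) = static_cast<float>(p2(1, 0));
  }
}

// Per frame:
cv::remap(src_img, dst_img, map_x, map_y, cv::INTER_LINEAR,
          cv::BORDER_CONSTANT, cv::Scalar(0));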

Duncan1115 commented 3 years ago

Hi @puzzlepaint, @painterdrown, I have cached the result of the unprojection->projection process as suggested. The code now computes undistorted images at 24 Hz at 640x480 resolution. Here is what I've done:

  1. Cache the results to a JSON file
  2. Read the JSON file before computing undistorted images

json_generate.cpp

#include <opencv2/opencv.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <stdio.h>
#include <iostream>
#include <jsoncpp/json/json.h>
#include <fstream>  
#include "/data/duncan/camera_calibration-master/applications/camera_calibration/generic_models/src/central_generic.h"
#include "/data/duncan/camera_calibration-master/applications/camera_calibration/generic_models/src/noncentral_generic.h"

using namespace std;
using namespace cv;

bool json_gen() {
  // Load the central_generic model.
  const char* file_path = "/data/duncan/camera_calibration-master/tango_test/result_15px_central_generic_20/intrinsics0.yaml";
  CentralGenericCamera<double> camera;
  std::string error_reason;
  bool result = camera.Read(file_path, &error_reason);
  if (!result) {
    std::cout << "Reading the camera failed! Reason: " << error_reason << std::endl;
    return false;
  }

  // Load fx, fy, cx, cy from the central_opencv model.
  double parameters[12] = { 277.43476524575, 277.29416182849, 307.81265362247, 233.0392096016, 0.93640031669187, 0.35692232820102, 0.015722514959806, 1.2608235807082, 0.59007496453161, 0.079399687311869, -0.00027415199228906, 5.6684176272052e-05 };
  const double& fx = parameters[0];
  const double& fy = parameters[1];
  const double& cx = parameters[2];
  const double& cy = parameters[3];

  cv::Mat src_img = cv::imread("/data/duncan/camera_calibration-master/tango_test/image000000_f.png", 
  cv::IMREAD_GRAYSCALE);
  cv::Mat dst_img(src_img.rows * 2, src_img.cols * 2, CV_8U);  // double the width and height to obtain larger FOV
  const int x_begin = 0 - camera.width();
  const int x_end = camera.width();
  const int y_begin = 0 - camera.height();
  const int y_end = camera.height();

  // Open for writing (overwrite; appending would corrupt the JSON on re-runs).
  ofstream os;
  Json::StyledWriter sw;
  Json::Value root;
  os.open("data1.json", std::ios::out | std::ios::trunc);
  if (!os.is_open()) {
    cout << "error: cannot open or create the file \"data1.json\"." << endl;
    return false;
  }
  int num = 0;

  // For each pixel p in this pinhole calibration.
  for (int x = x_begin; x < x_end; ++x) {
    for (int y = y_begin; y < y_end; ++y) {
      // Un-project the pixel coordinate of p to a direction d.
      // The unprojection is simply done by applying the inverse pinhole matrix to pixel (p.x, p.y)^T to get the direction ((p.x - c_x) / f_x, (p.y - c_y) / f_y, 1)^T.
      CentralGenericCamera<double>::PointT d((x - cx) / fx, (y - cy) / fy, 1); 

      // Project this direction d to a pixel coordinate p2 using the calibrated camera model.
      CentralGenericCamera<double>::PixelT p2;
      camera.Project(d, &p2); 

      double src_x = p2(0, 0);
      double src_y = p2(1, 0);
      // Use floor(x) and floor(x) + 1 as the interpolation corners; using
      // ceil() would give a zero denominator whenever src_x or src_y is an
      // integer. Out-of-range corners are filtered when reading the cache.
      double src_x_f = floor(src_x);
      double src_x_c = src_x_f + 1;
      double src_y_f = floor(src_y);
      double src_y_c = src_y_f + 1;
      double w1 = src_x_c - src_x;  // weight of the left column
      double w2 = src_x - src_x_f;  // weight of the right column
      double w3 = src_y_c - src_y;  // weight of the top row
      double w4 = src_y - src_y_f;  // weight of the bottom row

      // cache value to json
      root[num].append(src_x);
      root[num].append(src_y);
      root[num].append(src_x_f);
      root[num].append(src_x_c); 
      root[num].append(src_y_f);
      root[num].append(src_y_c);
      root[num].append(w1);
      root[num].append(w2);
      root[num].append(w3);
      root[num].append(w4);
      num++;
    }
  }
  os << sw.write(root);
  os.close();
  return true;
}

img_undistort.cpp

#include <opencv2/opencv.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <stdio.h>
#include <iostream>
#include <jsoncpp/json/json.h>
#include <fstream>  
#include <ctime>
#include "/data/duncan/camera_calibration-master/applications/camera_calibration/generic_models/src/central_generic.h"
#include "/data/duncan/camera_calibration-master/applications/camera_calibration/generic_models/src/noncentral_generic.h"

using namespace std;
using namespace cv;

clock_t time_start,time_end;

void bilinear_interpolate(const cv::Mat &src_img, const int src_x_f, const int src_x_c, const int src_y_f, const int src_y_c, 
                          const double w1, const double w2, const double w3,  const double w4, uchar &color) {

  uchar q12 = src_img.at<uchar>(src_y_f, src_x_f);
  uchar q22 = src_img.at<uchar>(src_y_f, src_x_c);
  uchar q11 = src_img.at<uchar>(src_y_c, src_x_f);
  uchar q21 = src_img.at<uchar>(src_y_c, src_x_c);

  color = w3 * (w1 * q12 + w2 * q22) + w4 * (w1 * q11 + w2 * q21); 
}

Mat test_undistortion(const char* file_path, string img_path) {
  cv::Mat src_img = cv::imread(img_path, cv::IMREAD_GRAYSCALE);
  cv::Mat dst_img = cv::Mat::zeros(src_img.rows * 2, src_img.cols * 2, CV_8U);  // double the width and height to obtain a larger FOV; zero-initialized since some pixels may be skipped
  CentralGenericCamera<double> camera;
  std::string error_reason;
  bool result = camera.Read(file_path, &error_reason);
  if (!result) {
    std::cout << "Reading the camera failed! Reason: " << error_reason << std::endl;
    return dst_img;
  }

  // Load fx, fy, cx, cy from the central_opencv model.
  double parameters[12] = { 277.43476524575, 277.29416182849, 307.81265362247, 233.0392096016, 0.93640031669187, 0.35692232820102, 0.015722514959806, 1.2608235807082, 0.59007496453161, 0.079399687311869, -0.00027415199228906, 5.6684176272052e-05 };
  const double& fx = parameters[0];
  const double& fy = parameters[1];
  const double& cx = parameters[2];
  const double& cy = parameters[3];

  const int x_begin = 0 - camera.width();
  const int x_end = camera.width();
  const int y_begin = 0 - camera.height();
  const int y_end = camera.height();
  const int n = src_img.rows * 2;
  const int m = src_img.cols * 2;

  Json::Reader reader;
  Json::Value root;

  cout << "start reading" << endl;
  ifstream in("data1.json", ios::binary);
  if (!in.is_open()) {
    cout << "Error opening file\n";
    return dst_img;
  }
  reader.parse(in, root);

  int *src_x_f,*src_x_c,*src_y_f,*src_y_c;
  double *w1,*w2,*w3,*w4;

  src_x_f = (int*)malloc(sizeof(int)* n * m);
  src_x_c = (int*)malloc(sizeof(int)* n * m);
  src_y_f = (int*)malloc(sizeof(int)* n * m);
  src_y_c = (int*)malloc(sizeof(int)* n * m);
  w1 = (double*)malloc(sizeof(double)* n * m);
  w2 = (double*)malloc(sizeof(double)* n * m);
  w3 = (double*)malloc(sizeof(double)* n * m);
  w4 = (double*)malloc(sizeof(double)* n * m);

  // Read all cached entries. Out-of-range entries are filtered later, in the
  // filling loop, so that num stays in sync with the cache.
  int num = 0;
  for (int x = x_begin; x < x_end; ++x) {
    for (int y = y_begin; y < y_end; ++y) {
      src_x_f[num] = root[num][2].asInt();
      src_x_c[num] = root[num][3].asInt();
      src_y_f[num] = root[num][4].asInt();
      src_y_c[num] = root[num][5].asInt();
      w1[num] = root[num][6].asDouble();
      w2[num] = root[num][7].asDouble();
      w3[num] = root[num][8].asDouble();
      w4[num] = root[num][9].asDouble();
      num++;
    }
  }

  cout<<"start calculating"<<endl;

  uchar color = 0;
  num = 0;
  time_start = clock();
  // For each pixel p in this pinhole calibration.
  for (int x = x_begin; x < x_end; ++x) {
    for (int y = y_begin; y < y_end; ++y) {
      // Skip pixels whose source corners fall outside the original image
      // (dst_img stays zero there).
      if (src_x_f[num] >= 0 && src_x_c[num] < src_img.cols &&
          src_y_f[num] >= 0 && src_y_c[num] < src_img.rows) {
        bilinear_interpolate(src_img, src_x_f[num], src_x_c[num], src_y_f[num], src_y_c[num],
                             w1[num], w2[num], w3[num], w4[num], color);
        dst_img.at<uchar>(y + camera.height(), x + camera.width()) = color;
      }
      num++;
    }
  }

  time_end = clock();
  double endtime = (double)(time_end - time_start) / CLOCKS_PER_SEC;
  cout << "Total time: " << endtime << endl;

  // Release the cached arrays.
  free(src_x_f); free(src_x_c); free(src_y_f); free(src_y_c);
  free(w1); free(w2); free(w3); free(w4);

  return dst_img;
}

int main() {
  // Load the central_generic model.
  const char* file_path = "/data/duncan/camera_calibration-master/tango_test/result_15px_central_generic_20/intrinsics0.yaml";
  string img_path = "/data/duncan/camera_calibration-master/tango_test/image000000_f.png";
  Mat undistort_img = test_undistortion(file_path, img_path);
  cv::imwrite("/data/duncan/camera_calibration-master/test1.png", undistort_img);
  return 0;
}

Thank you all very much! Maybe this code can be used in some real-time projects.

Regards,
Duncan

Mohamed-Ali-Maher commented 1 year ago

@painterdrown

Thank you very much for your code.

Could you tell me why I get the error:

'CentralGenericCamera' was not declared in this scope

Thank you.