xoox / calibrel

More Accurate Pinhole Camera Calibration with Imperfect Planar Target. Not maintained anymore.
https://xoox.github.io/calibrel/

Testing results are not as good as expected according to the paper #1

Closed xoox closed 6 years ago

xoox commented 6 years ago

We've tested this calibration method with the test data in calibrel_testdata. The results are not as good as expected according to the paper. The new method is sometimes worse than OpenCV's method and sometimes better, but in general there is no significant improvement over OpenCV's standard method. The detailed test results are shown below.

Data set 1

OpenCV's method

<camera_matrix type_id="opencv-matrix">
  <rows>3</rows>
  <cols>3</cols>
  <dt>d</dt>
  <data>
    3.0670601055435177e+03 0. 5.4168384780123540e+02 0.
    3.0737503993310925e+03 5.3691339052892795e+02 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
  <rows>5</rows>
  <cols>1</cols>
  <dt>d</dt>
  <data>
    -1.6114330409319458e-01 0. -2.2729685301187963e-03
    -3.6581288200174624e-03 0.</data></distortion_coefficients>
<avg_reprojection_error>3.1406583963136664e-01</avg_reprojection_error>

DLR11 method

<camera_matrix type_id="opencv-matrix">
  <rows>3</rows>
  <cols>3</cols>
  <dt>d</dt>
  <data>
    3.0209501913836739e+03 0. 6.3993993777133790e+02 0.
    3.0271840392465974e+03 5.1194779030932455e+02 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
  <rows>5</rows>
  <cols>1</cols>
  <dt>d</dt>
  <data>
    -3.0555798992700906e-02 0. 2.4124468455964283e-03
    5.1681504633348035e-03 0.</data></distortion_coefficients>
<avg_reprojection_error>3.6778428836778632e-01</avg_reprojection_error>

Hybrid method

<camera_matrix type_id="opencv-matrix">
  <rows>3</rows>
  <cols>3</cols>
  <dt>d</dt>
  <data>
    3.0672888652406186e+03 0. 5.4174166778823223e+02 0.
    3.0724823332425444e+03 5.3686955699567227e+02 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
  <rows>5</rows>
  <cols>1</cols>
  <dt>d</dt>
  <data>
    -1.0288404944496776e-01 0. -1.4505726936981191e-04
    -6.1541237963353961e-03 0.</data></distortion_coefficients>
<avg_reprojection_error>2.9886574412779127e-01</avg_reprojection_error>
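
To compare runs like the three above side by side, it can help to pull fx, fy, cx, cy and the RMS out of each fragment. The following is a minimal sketch using only the Python standard library; the helper names `read_matrix` and `parse_result` are hypothetical, and the fragment is wrapped in a dummy root element because the snippets posted here are not complete XML documents.

```python
import xml.etree.ElementTree as ET

# OpenCV's result for data set 1, copied from this thread.
OPENCV_DATASET1 = """
<camera_matrix type_id="opencv-matrix">
  <rows>3</rows>
  <cols>3</cols>
  <dt>d</dt>
  <data>
    3.0670601055435177e+03 0. 5.4168384780123540e+02 0.
    3.0737503993310925e+03 5.3691339052892795e+02 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
  <rows>5</rows>
  <cols>1</cols>
  <dt>d</dt>
  <data>
    -1.6114330409319458e-01 0. -2.2729685301187963e-03
    -3.6581288200174624e-03 0.</data></distortion_coefficients>
<avg_reprojection_error>3.1406583963136664e-01</avg_reprojection_error>
"""


def read_matrix(node):
    """Convert an opencv-matrix element into a row-major list of lists."""
    rows = int(node.findtext("rows"))
    cols = int(node.findtext("cols"))
    values = [float(v) for v in node.findtext("data").split()]
    if len(values) != rows * cols:
        raise ValueError("matrix size does not match rows * cols")
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


def parse_result(fragment):
    """Extract intrinsics, distortion vector and RMS from one fragment."""
    # Wrap in a dummy root so the multi-element fragment parses as XML.
    root = ET.fromstring("<root>" + fragment + "</root>")
    k = read_matrix(root.find("camera_matrix"))
    dist = [row[0] for row in read_matrix(root.find("distortion_coefficients"))]
    return {
        "fx": k[0][0], "fy": k[1][1], "cx": k[0][2], "cy": k[1][2],
        "dist": dist,
        "rms": float(root.findtext("avg_reprojection_error")),
    }


result = parse_result(OPENCV_DATASET1)
print(result["fx"], result["fy"], result["cx"], result["cy"], result["rms"])
```

Running the same function over the DLR11 and hybrid fragments gives directly comparable dictionaries.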
xoox commented 6 years ago

Dr. Strobl's comments on the original test dataset:

I looked at your data, and the reason is probably that your images were taken "the OpenCV/Bouguet way." Instead, you should take more tilted images and avoid perpendicular ones (see https://www.robotic.dlr.de/fileadmin/robotic/stroblk/publications/strobl_2009iros_.pdf, page 311, bottom) and, if the software allows it (DLR CalDe and DLR CalLab do), fill the image with corners (i.e., leave some corners outside the field of view). See our invaluable calibration hints at https://www.dlr.de/rm/en/desktopdefault.aspx/tabid-3925/6084_read-9196/. Good calibration images look like this: (image: Calibration image sample). (Older) sample images can be found here: https://www.dlr.de/rm/en/Portaldata/52/Resources/Software/CalLab/sample.zip

With these images it is difficult, if not impossible, to triangulate and recover the 3D geometry of the pattern, which is precisely what my algorithm is intended to do. But this applies not only to my algorithm: the basic camera calibration method is also meant to separate the effects of the camera position from those of the focal length. This is only possible through perspective distortion or external measurements (which we don't like). Hence any calibration with that kind of images may deliver a low RMS, but the estimated parameters will be wrong.
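
The coupling between camera position and focal length can be checked numerically. The sketch below is an illustration under simplifying assumptions (an ideal pinhole without distortion, a board rotated only about its vertical axis, and made-up values for focal length, distance and board size); it is not code from calibrel or CalLab. For a fronto-parallel board, scaling the focal length and the distance by the same factor leaves every projected corner in place, so the two cannot be told apart. For a tilted board the same scaling moves the corners.

```python
import math


def project(points, f, d, tilt_deg):
    """Project planar board points (x, y, 0) with an ideal pinhole camera.

    The board is rotated by tilt_deg about its vertical axis and placed
    at distance d along the optical axis. f is the focal length in pixels.
    """
    t = math.radians(tilt_deg)
    projected = []
    for x, y in points:
        xc = x * math.cos(t)        # lateral position in the camera frame
        zc = d + x * math.sin(t)    # depth varies across a tilted board
        projected.append((f * xc / zc, f * y / zc))
    return projected


def max_shift(points, f, d, tilt_deg, scale):
    """Largest pixel shift when f and d are both multiplied by scale."""
    a = project(points, f, d, tilt_deg)
    b = project(points, f * scale, d * scale, tilt_deg)
    return max(math.hypot(ua - ub, va - vb)
               for (ua, va), (ub, vb) in zip(a, b))


# Hypothetical setup: 0.2 m wide board, f = 3000 px, 1 m away.
board = [(x / 10.0, y / 10.0) for x in (-1, 0, 1) for y in (-1, 0, 1)]

print("fronto-parallel:", max_shift(board, 3000.0, 1.0, 0.0, 1.1))
print("tilted 40 deg:  ", max_shift(board, 3000.0, 1.0, 40.0, 1.1))
```

The first value is zero up to rounding, while the second is on the order of pixels, which is the perspective distortion the optimizer needs in order to separate range from focal length.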

Dr. Strobl's answer to the following question.

I have a long-standing unanswered question. If we use tilted images with large oblique angles, some parts of the image will be out of focus and hence blurry. That might decrease the accuracy of corner detection, is that right? Or is my concern unnecessary? Or do we have to strike a balance between image defocus and oblique angle?

The problem you mention is only noticeable if your camera aperture is very big or the distance to the pattern is too small. We rarely have that problem -- I even suggest taking oblique images at two different distances. Even though calibration should not pose constraints on the camera setup, consider:

  1. printing a bigger pattern and focusing at a more distant region
  2. lowering the aperture size

If you still have that problem, then you're right, you should get a balance between defocus and perspective distortion (oblique images). Note that CalDe does an awesome job at detecting corners even if the images are blurry.
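
The effect of the two suggestions can be estimated with the standard thin-lens depth-of-field formulas. This is a rough sketch only: the lens (16 mm), the f-numbers, the focus distances and the circle of confusion (0.005 mm) are hypothetical values chosen for illustration and do not come from the thread.

```python
import math


def depth_of_field(focal_mm, f_number, focus_dist_mm, coc_mm=0.005):
    """Return (near, far) limits of acceptable sharpness in mm.

    Uses the thin-lens hyperfocal approximation. coc_mm is the
    acceptable circle of confusion (an assumed value here).
    """
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = (focus_dist_mm * (hyperfocal - focal_mm)
            / (hyperfocal + focus_dist_mm - 2 * focal_mm))
    if focus_dist_mm >= hyperfocal:
        far = math.inf
    else:
        far = (focus_dist_mm * (hyperfocal - focal_mm)
               / (hyperfocal - focus_dist_mm))
    return near, far


def dof_width(focal_mm, f_number, focus_dist_mm):
    """Width of the sharp zone in mm."""
    near, far = depth_of_field(focal_mm, f_number, focus_dist_mm)
    return far - near


# Wide aperture, close pattern versus the two suggested remedies.
print("f/2 at 0.5 m:", dof_width(16.0, 2.0, 500.0))
print("f/8 at 0.5 m:", dof_width(16.0, 8.0, 500.0))   # smaller aperture
print("f/2 at 1.0 m:", dof_width(16.0, 2.0, 1000.0))  # more distant focus
```

Both remedies widen the sharp zone, which leaves more room for strongly tilted views of the pattern.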

Did you understand why oblique images are better? Else you cannot tell range from focal length!

The problem with oblique images is that the 2-D density of corners is higher on one side of the image than on the other, hence one region of the image gets better calibrated than the other. To cope with that problem you can take 4 oblique images (up, down, right, left) and get some symmetry.

xoox commented 6 years ago

Here is a dataset with 9 images. Feature points were detected using CalDe. In order to compare the results with our implementation, each image has the same number of features.

Link to the dataset: dataset5.zip

CalLab's standard method and OpenCV gave similar results.

CalLab standard method:

camera.0.k1= -0.0634703
camera.0.p1= -0.00471115
camera.0.p2= 0.00448202
camera.0.A=[ 3089.14 0.000000 663.435; 0.000000 3099.48 480.222; 0.000000 0.000000 1.00000]
camera.0.rmsInt= 0.755889

OpenCV standard method:

<camera_matrix type_id="opencv-matrix">
  <rows>3</rows>
  <cols>3</cols>
  <dt>d</dt>
  <data>
    3.0891256444690321e+03 0. 6.6352137323491718e+02 0.
    3.0994725315433725e+03 4.8023634412293796e+02 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
  <rows>5</rows>
  <cols>1</cols>
  <dt>d</dt>
  <data>
    -6.3476243923252254e-02 0. -4.7097377006577604e-03
    4.4901135146666543e-03 0.</data></distortion_coefficients>
<avg_reprojection_error>7.5588754387089074e-01</avg_reprojection_error>
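
For reading the two outputs against each other: OpenCV stores its five distortion coefficients in the order (k1, k2, p1, p2, k3), so the third and fourth entries above line up numerically with CalLab's p1 and p2. The sketch below applies that radial-tangential model to a normalized image point. The function names are hypothetical, and the assumption that CalLab's p1 and p2 follow the same convention rests only on the matching numbers in this thread.

```python
def distort(x, y, dist):
    """Apply OpenCV's radial-tangential model to a normalized point.

    dist is ordered (k1, k2, p1, p2, k3), as in the XML output above.
    """
    k1, k2, p1, p2, k3 = dist
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xd = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    yd = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return xd, yd


def to_pixels(x, y, fx, fy, cx, cy, dist):
    """Map a normalized point to pixel coordinates."""
    xd, yd = distort(x, y, dist)
    return fx * xd + cx, fy * yd + cy


# Values from the OpenCV standard method result on dataset5.
DIST = (-6.3476243923252254e-02, 0.0,
        -4.7097377006577604e-03, 4.4901135146666543e-03, 0.0)
FX, FY = 3.0891256444690321e+03, 3.0994725315433725e+03
CX, CY = 6.6352137323491718e+02, 4.8023634412293796e+02

# The optical axis maps to the principal point regardless of distortion.
print(to_pixels(0.0, 0.0, FX, FY, CX, CY, DIST))
# An off-axis point is shifted relative to the undistorted projection.
print(to_pixels(0.1, 0.05, FX, FY, CX, CY, DIST))
```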

CalLab with refining full object structure:

camera.0.k1= -0.0899712
camera.0.p1= -0.00169304
camera.0.p2= -0.000343253
camera.0.A=[ 3056.29 0.000000 596.089; 0.000000 3037.15 439.422; 0.000000 0.000000 1.00000]
camera.0.rmsInt= 0.151591

Our implementation without initial intrinsic parameters guess:

<camera_matrix type_id="opencv-matrix">
  <rows>3</rows>
  <cols>3</cols>
  <dt>d</dt>
  <data>
    3.0936665220239684e+03 0. 6.3993195978817482e+02 0.
    3.0998850523236238e+03 5.1212558075310881e+02 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
  <rows>5</rows>
  <cols>1</cols>
  <dt>d</dt>
  <data>
    -6.9046083676842063e-02 0. -2.3558689963820191e-03
    1.4073143102732205e-03 0.</data></distortion_coefficients>
<avg_reprojection_error>6.8544169962641910e-01</avg_reprojection_error>

Our implementation using OpenCV standard method as initial intrinsic parameters guess:

<camera_matrix type_id="opencv-matrix">
  <rows>3</rows>
  <cols>3</cols>
  <dt>d</dt>
  <data>
    3.0894945313852149e+03 0. 6.6349963937561552e+02 0.
    3.0974661465090298e+03 4.8023370212223756e+02 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
  <rows>5</rows>
  <cols>1</cols>
  <dt>d</dt>
  <data>
    -6.5867947154494330e-02 0. -4.2237024007513498e-03
    4.9095414646612217e-03 0.</data></distortion_coefficients>
<avg_reprojection_error>4.3044986222261095e-01</avg_reprojection_error>

Our implementation did not reach the same optimized results as CalLab. Further improvements or fixes are needed.

We also found that choosing different fixed 3D points (x1, x2, x3; see Dr. Strobl's paper, section 3.4) leads to quite different results, with varying RMS and camera parameters.

The camera matrix reported by CalLab when refining the full object structure was also very different from the one reported by the standard method.

xoox commented 6 years ago

A bug was fixed in 6da25267. Our implementation now reaches the same RMS level as CalLab. With the same dataset as above (dataset5.zip), we got the following results.

CalLab with refining full object structure:

camera.0.k1= -0.0984749
camera.0.k2= 0.127567
camera.0.A=[ 3056.33 0.000000 584.655; 0.000000 3037.10 466.268; 0.000000 0.000000 1.00000]
camera.0.rmsInt= 0.151942

Our implementation:

<camera_matrix type_id="opencv-matrix">
  <rows>3</rows>
  <cols>3</cols>
  <dt>d</dt>
  <data>
    3.0573577737911560e+03 0. 5.9777231463549515e+02 0.
    3.0362037824002300e+03 4.5311355407991232e+02 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
  <rows>5</rows>
  <cols>1</cols>
  <dt>d</dt>
  <data>
    -1.0343613204866929e-01 1.7854204424225634e-01 0. 0. 0.</data></distortion_coefficients>
<avg_reprojection_error>1.5178474259880476e-01</avg_reprojection_error>

CalLab with refining full object structure:

camera.0.k1= -0.0900870
camera.0.A=[ 3055.39 0.000000 582.201; 0.000000 3036.52 459.132; 0.000000 0.000000 1.00000]
camera.0.rmsInt= 0.152238

Our implementation:

<camera_matrix type_id="opencv-matrix">
  <rows>3</rows>
  <cols>3</cols>
  <dt>d</dt>
  <data>
    3.0555752306570489e+03 0. 5.8751317251297030e+02 0.
    3.0365382066513448e+03 4.6195508556785745e+02 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
  <rows>5</rows>
  <cols>1</cols>
  <dt>d</dt>
  <data>
    -8.9847714399800147e-02 0. 0. 0. 0.</data></distortion_coefficients>
<avg_reprojection_error>1.5216911603787708e-01</avg_reprojection_error>

Our implementation with feature points detected with OpenCV:

<camera_matrix type_id="opencv-matrix">
  <rows>3</rows>
  <cols>3</cols>
  <dt>d</dt>
  <data>
    3.0557118442612523e+03 0. 5.8771714504295210e+02 0.
    3.0365422748810047e+03 4.6445492814159769e+02 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
  <rows>5</rows>
  <cols>1</cols>
  <dt>d</dt>
  <data>
    -8.9434361134088730e-02 0. 0. 0. 0.</data></distortion_coefficients>
<avg_reprojection_error>1.5527737807096595e-01</avg_reprojection_error>

The minor differences between the camera intrinsic parameters might be an effect of the choice of the fixed 3D points (0, 0, 0), (d, 0, 0) and (x3, y3, 0). In our implementation these three points were the top-left, top-right and bottom-right points of the chessboard corner grid.

xoox commented 6 years ago

We investigated why there are obvious differences between our results and CalLab's results as shown below.

| parameter | CalLab | calibrel |
| --- | --- | --- |
| fx | 3055.39 | 3055.58 |
| fy | 3036.52 | 3036.54 |
| cx | 582.201 | 587.513 |
| cy | 459.132 | 461.955 |
| k1 | -0.090087 | -0.0898477 |
| RMS | 0.152238 | 0.152169 |

As theoretically expected, testing confirmed that the selection of fixed 3D points does not affect the result.

If we feed CalLab's results into calibrel as initial values and fix them, calibrel reaches a similar RMS of 0.152226.

If cx = 587.513 and cy = 461.955 from calibrel are fixed in CalLab, CalLab reaches an even smaller RMS of 0.152181 and reports the following parameters, which are closer to calibrel's.

| parameter | CalLab |
| --- | --- |
| fx | 3055.57 |
| fy | 3036.54 |
| k1 | -0.0899015 |

We conclude that CalLab fails to reach the best optimization result, especially for cx and cy. Possible reasons are as follows.

But the above guesses cannot explain why CalLab gives the same results as OpenCV when refinement of the full object structure is not requested. Further detailed investigation is still needed.

CalLab is very sensitive to Levenberg-Marquardt parameters. Here is a comparison.

| parameter | FTOL=0.0001, RELSTEP=0.01 | FTOL=0.00001, RELSTEP=0.045 |
| --- | --- | --- |
| fx | 3055.39 | 3055.66 |
| fy | 3036.52 | 3036.42 |
| cx | 582.201 | 588.023 |
| cy | 459.132 | 461.368 |
| k1 | -0.090087 | 0.0899177 |
| RMS | 0.152238 | 0.152171 |

xoox commented 6 years ago

CalLab and calibrel were compared again with dataset1, dataset2 and dataset3 from the following archive.

https://github.com/xoox/calibrel_testdata/archive/CalLab.zip

The three datasets above were actually sampled from a single dataset that was split into three parts, so the camera parameters calculated from them should be consistent with each other. The feature points were detected with OpenCV and refined with cornerSubPix().

CalLab results

FTOL=0.00001; other optimization parameters were left at their defaults. For RELSTEP=0.01

| Parameter | dataset1 | dataset2 | dataset3 |
| --- | --- | --- | --- |
| RMS | 0.0867361 | 0.0895668 | 0.0861593 |
| fx | 3040.94 | 3040.96 | 3040.30 |
| fy | 3041.17 | 3040.16 | 3038.64 |
| cx | 607.857 | 608.843 | 607.461 |
| cy | 536.170 | 537.148 | 537.625 |
| k1 | -0.0829592 | -0.0822956 | -0.0837340 |

For RELSTEP=0.04

| Parameter | dataset1 | dataset2 | dataset3 |
| --- | --- | --- | --- |
| RMS | 0.0867251 | 0.0895250 | 0.0861507 |
| fx | 3041.04 | 3040.99 | 3040.38 |
| fy | 3041.11 | 3040.19 | 3038.73 |
| cx | 608.782 | 607.429 | 607.820 |
| cy | 537.467 | 537.160 | 538.726 |
| k1 | -0.0829699 | -0.0823135 | -0.0837408 |

For RELSTEP=0.0025

| Parameter | dataset1 | dataset2 | dataset3 |
| --- | --- | --- | --- |
| RMS | 0.0869889 | 0.0896314 | 0.0864126 |
| fx | 3040.74 | 3040.98 | 3040.14 |
| fy | 3041.16 | 3040.17 | 3038.54 |
| cx | 609.737 | 609.443 | 612.475 |
| cy | 533.515 | 536.100 | 533.012 |
| k1 | -0.0828763 | -0.0822277 | -0.0837854 |

When RELSTEP=0.04 was used, CalLab's results were closer to calibrel's results.

calibrel results

| Parameter | dataset1 | dataset2 | dataset3 |
| --- | --- | --- | --- |
| RMS | 0.0867244 | 0.0895241 | 0.0861504 |
| fx | 3041.01665 | 3040.99082 | 3040.37711 |
| fy | 3041.10761 | 3040.19260 | 3038.72519 |
| cx | 608.810470 | 607.321286 | 607.935824 |
| cy | 537.219851 | 537.245663 | 538.764664 |
| k1 | -0.0829604 | -0.0823206 | -0.0837447 |
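
Since the three datasets come from the same camera, one simple way to judge consistency is the spread (maximum minus minimum) of each parameter across dataset1, dataset2 and dataset3. The sketch below does that arithmetic on the values reported above; the choice of spread as the measure is ours for illustration, not something prescribed by CalLab or calibrel.

```python
# Values copied from the tables above: (dataset1, dataset2, dataset3).
CALLAB_RELSTEP_0_01 = {
    "fx": (3040.94, 3040.96, 3040.30),
    "fy": (3041.17, 3040.16, 3038.64),
    "cx": (607.857, 608.843, 607.461),
    "cy": (536.170, 537.148, 537.625),
}
CALLAB_RELSTEP_0_04 = {
    "fx": (3041.04, 3040.99, 3040.38),
    "fy": (3041.11, 3040.19, 3038.73),
    "cx": (608.782, 607.429, 607.820),
    "cy": (537.467, 537.160, 538.726),
}
CALLAB_RELSTEP_0_0025 = {
    "fx": (3040.74, 3040.98, 3040.14),
    "fy": (3041.16, 3040.17, 3038.54),
    "cx": (609.737, 609.443, 612.475),
    "cy": (533.515, 536.100, 533.012),
}
CALIBREL = {
    "fx": (3041.01665, 3040.99082, 3040.37711),
    "fy": (3041.10761, 3040.19260, 3038.72519),
    "cx": (608.810470, 607.321286, 607.935824),
    "cy": (537.219851, 537.245663, 538.764664),
}


def spread(values):
    """Range of a parameter across the three datasets."""
    return max(values) - min(values)


def spreads(result):
    """Spread of every parameter in one result table."""
    return {name: spread(vals) for name, vals in result.items()}


for label, table in [("CalLab RELSTEP=0.01", CALLAB_RELSTEP_0_01),
                     ("CalLab RELSTEP=0.04", CALLAB_RELSTEP_0_04),
                     ("CalLab RELSTEP=0.0025", CALLAB_RELSTEP_0_0025),
                     ("calibrel", CALIBREL)]:
    row = ", ".join(f"{k}: {v:.3f}" for k, v in spreads(table).items())
    print(f"{label:24s} {row}")
```

In every configuration the focal lengths vary less across datasets than the principal point does, and the principal point spread is largest for RELSTEP=0.0025.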

CalLab's sensitivity to RELSTEP

With dataset5.zip, FTOL=0.00001 and GTOL=1e-6, for standard calibration:

| Parameter | RELSTEP=0.1 | RELSTEP=0.01 | RELSTEP=0.001 |
| --- | --- | --- | --- |
| RMS | 0.798080 | 0.798078 | 0.798078 |
| fx | 3089.99 | 3089.82 | 3089.81 |
| fy | 3100.11 | 3100.00 | 3099.99 |
| cx | 619.898 | 619.939 | 619.944 |
| cy | 516.445 | 516.496 | 516.499 |
| k1 | -0.0639205 | -0.0639682 | -0.0639729 |

The standard calibration method is stable with regard to RELSTEP.

With the same dataset5, when refining the full object structure, the results diverge from each other (especially cx and cy).

| Parameter | RELSTEP=0.1 | RELSTEP=0.01 | RELSTEP=0.001 |
| --- | --- | --- | --- |
| RMS | 0.152505 | 0.152237 | 0.156158 |
| fx | 3056.14 | 3055.39 | 3056.14 |
| fy | 3034.46 | 3036.52 | 3037.84 |
| cx | 600.033 | 582.201 | 611.249 |
| cy | 436.028 | 459.132 | 513.135 |
| k1 | -0.0918021 | -0.0900870 | -0.0866259 |

When the dimensionality of the parameter space is much higher, cx and cy are perhaps more sensitive to the Jacobian. This kind of divergence was also found with dataset1, dataset2 and dataset3, but the differences were smaller. dataset5 might not fully cover the field of view, and the plate used in dataset5 was not as rigid as the one used in dataset1, dataset2 and dataset3.
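
One possible contributor, assuming RELSTEP is the relative step used for a forward-difference Jacobian (the thread does not confirm how CalLab uses it), is the truncation error of the finite difference itself. The sketch below differentiates an ideal pinhole projection u = f * X / Z with respect to the range Z and compares the result with the analytic derivative. All numbers and function names are hypothetical.

```python
def forward_difference(func, x, relstep):
    """Forward-difference derivative with a step relative to |x|."""
    h = relstep * abs(x) if x != 0 else relstep
    return (func(x + h) - func(x)) / h


def project_u(z, f=3000.0, x=100.0):
    """Image coordinate of a point at lateral offset x and range z."""
    return f * x / z


def relative_jacobian_error(relstep, z=1000.0, f=3000.0, x=100.0):
    """Relative error of the finite-difference derivative du/dz."""
    exact = -f * x / z ** 2
    approx = forward_difference(lambda zz: project_u(zz, f, x), z, relstep)
    return abs(approx - exact) / abs(exact)


for relstep in (0.1, 0.01, 0.001):
    print(relstep, relative_jacobian_error(relstep))
```

For this function the relative error works out to relstep / (1 + relstep), so it shrinks steadily with the step. That alone does not explain why RELSTEP=0.001 gave a worse RMS on dataset5 above; very small steps can instead suffer from round-off, and a weakly constrained problem can amplify either effect.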