alexklwong / calibrated-backprojection-network

PyTorch Implementation of Unsupervised Depth Completion with Calibrated Backprojection Layers (ORAL, ICCV 2021)

camera intrinsic matrix #4

Closed zinuok closed 2 years ago

zinuok commented 2 years ago

Hello, I have a question about the values in "K.txt".

In the original VOID dataset, the intrinsic parameters provided here are:

"f_x": 514.638,
"f_y": 518.858,
"c_x": 315.267,
"c_y": 247.358,

However, in "K.txt":

```
5.471833965147203571e+02 0.000000000000000000e+00 3.176305425559989430e+02
0.000000000000000000e+00 5.565094509450176474e+02 2.524727249693490592e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00
```

where, as far as I know, K =

```
f_x   0    c_x
0     f_y  c_y
0     0    1
```

They are somewhat different.

Q1. Is the camera's distortion model (radtan) already applied in "K.txt"?

Q2. Second, why are the intrinsic parameters different across sequences? Did you use a different sensor setup for each sequence? (In your paper, it is written that a D435i was used for data acquisition.) If so, which intrinsics should be used in a real application, such as VIO?

Many thanks in advance

alexklwong commented 2 years ago

Ah, let me clarify. The distortion model is not included in the calibration. I believe the intrinsic parameters above came from the factory calibration settings -- those were posted afterwards because someone asked for them. For the actual dataset, we calibrated the sensor for each sequence, so the numbers are slightly different across the sequences. You should use the K.txt provided for each sequence when running it.
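For anyone following along, the per-sequence K.txt can be read with NumPy as in the sketch below. The inline string stands in for a sequence's K.txt file; the variable names are just for illustration.

```python
import numpy as np

# Stand-in for the contents of one sequence's K.txt:
# the 3x3 intrinsic matrix as whitespace-separated plain text.
k_text = """5.471833965147203571e+02 0.000000000000000000e+00 3.176305425559989430e+02
0.000000000000000000e+00 5.565094509450176474e+02 2.524727249693490592e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00"""

# np.loadtxt accepts any iterable of lines, so we can parse the string
# directly; with a real file you would pass the path instead.
K = np.loadtxt(k_text.splitlines())

f_x, f_y = K[0, 0], K[1, 1]  # focal lengths in pixels
c_x, c_y = K[0, 2], K[1, 2]  # principal point
```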

zinuok commented 2 years ago

Thank you very much for taking your valuable time to answer my questions. Sorry to bother you, but I have one more question.

You said that "The distortion model should not be included in calibration".

Looking through your code and paper, it seems that the distortion model is not taken into account (only the inverse intrinsics are used when features are back-projected into 3D space).
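To be concrete, by inverse-intrinsics back-projection I mean the standard pinhole lifting X = depth * K^-1 [u, v, 1]^T, something like this sketch (the function and values are my own illustration, not code from your repo):

```python
import numpy as np

def backproject(u, v, depth, K):
    """Lift a pixel (u, v) with metric depth to a 3D point in the
    camera frame via the inverse intrinsics: X = depth * K^-1 [u, v, 1]^T."""
    pixel_h = np.array([u, v, 1.0])   # homogeneous pixel coordinate
    ray = np.linalg.inv(K) @ pixel_h  # ray direction at unit depth
    return depth * ray                # scale the ray by the depth value

# Intrinsics rounded from one VOID sequence's K.txt
K = np.array([[547.18,   0.0, 317.63],
              [  0.0, 556.51, 252.47],
              [  0.0,   0.0,    1.0]])

point = backproject(320.0, 240.0, 2.0, K)  # pixel near center, 2 m deep
```

Note that no distortion term appears anywhere in this lifting.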

I wonder whether the distortion model needs to be taken into account in your network. (Or were the images already rectified using the distortion coefficients?)

Thank you very much

alexklwong commented 2 years ago

That's correct, it's a rudimentary calibration model, i.e. pinhole, so we do not account for distortion in the paper. I don't recall going through a rectification process when we collected the dataset either. I think this works because there isn't anything noticeable in terms of lens distortion. On the other hand, if you were to use a fisheye lens then yes, you would definitely need to undistort first. This was the case in a previous paper (https://arxiv.org/pdf/1905.08616.pdf) where we tried out TUM VI.

zinuok commented 2 years ago

Oh I see, I'll refer to the link. Thanks again!

alexklwong commented 2 years ago

In case you are interested: https://github.com/IntelRealSense/librealsense/issues/1430#issuecomment-378162424 -- for the RealSense D400 series, images would be off by at most 1 pixel at the extremes, so there would be little to no difference.