Closed leiwang1023 closed 5 months ago
Thank you for pointing this out. There is a typo in the paper about H and W in f_x and f_y: it should be max(H, W).
Anyway, for simplicity let's take only the first line, which corresponds to the focal length along the x dimension; the other lines are the same but for different camera parameters. The first line can be rewritten as f_x = max(H, W) / 2 * Delta f_x.
The lack of clarity may stem from overwriting the intrinsics variable. The camera_layer output, i.e. the first intrinsics, contains the "delta" camera parameters, namely the so-called multiplicative residuals Delta f_x, Delta f_y, Delta c_x, Delta c_y, in the appropriate locations (i.e., [0,0], [1,1], [0,2], [1,2]). Each location is then overwritten to obtain the parameters of the proper projection matrix from the multiplicative residuals of the camera_layer, i.e. from Delta f_x to f_x and so on.
Let me know if this clarifies things or if anything is still unclear.
Hi. According to the paper, the camera parameters are predicted by the network, and the pinhole imaging model is then used to obtain the focal length and the principal point coordinates with the following formula:
but in your code:
I am confused about the difference and the connection between these two.
Thanks a lot!