owl-project / NVISII

Apache License 2.0

`get_intrinsics_matrix()` returns incorrect intrinsics matrix. #135

Open manuelli opened 2 years ago

manuelli commented 2 years ago

It seems that get_intrinsics_matrix() always sets cx, cy to the image center, not the actual values that were passed in. To test this I used the following script.

import nvisii

nvisii.initialize(headless=True)

# Intrinsics with a deliberately off-center principal point
# (the image center would be cx=320, cy=240).
fx = 450
fy = 450
cx = 325
cy = 245
width = 640
height = 480

# Create a camera from those intrinsics and make it the active camera.
cam_component = nvisii.camera.create_from_intrinsics("camera", fx, fy, cx, cy, width, height)
camera = nvisii.entity.create("camera", camera=cam_component)
nvisii.set_camera_entity(camera)

# Query the intrinsics matrix back out.
intrinsics = nvisii.entity.get("camera").get_camera().get_intrinsic_matrix(width, height)
print("intrinsics", intrinsics)

nvisii.deinitialize()

Here cx=325, but when querying the intrinsics matrix it says it is 320. From looking at rendered images it does seem that the passed-in cx, cy are being used when rendering; they are just not reflected in the intrinsics matrix that gets returned.

intrinsics mat3x3((450.000000, 0.000000, 0.000000), (0.000000, 450.000000, 0.000000), (320.000000, 240.000000, 1.000000))
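For reference when reading the output above: an OpenCV-convention intrinsics matrix places (cx, cy) in the last column of a row-major 3x3, while NVISII's mat3x3 prints column-major (glm convention), so that same column appears as the last printed tuple. A minimal numpy sketch of what the query should return for the values in the script:

```python
import numpy as np

# OpenCV-style pinhole intrinsics (row-major): principal point in the last column.
fx, fy, cx, cy = 450.0, 450.0, 325.0, 245.0
K = np.array([
    [fx, 0.0, cx],
    [0.0, fy, cy],
    [0.0, 0.0, 1.0],
])

# NVISII's printed mat3x3 lists columns, so K's last column (cx, cy, 1)
# shows up as the last printed tuple. With the bug above, that tuple reads
# (320, 240, 1) -- the image center -- instead of the (325, 245, 1) passed in.
print(K)
```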
natevm commented 2 years ago

@TontonTremblay If I remember right, we just recycled code taken from Open3D. Is that correct? Do you know why there appear to be differences between these implementations?

@manuelli Just for context, we never really use intrinsics matrices in the rendering world, hence why this code isn't very well validated. Intrinsics matrices assume a pinhole camera model, which is not physically based (unless you're modeling a camera obscura, haha). It goes somewhat against the philosophy of "physically based everything".

In rasterization, we more commonly use projection matrices, but even then, in ray tracing these projection matrices only loosely guide ray direction. Depth of field settings, antialiasing jitter, and motion blur of the camera also factor into the outgoing ray directions and corresponding perceived color values.
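To make the "loosely guide" point concrete, here is a generic ray-generation sketch (not NVISII's actual code; `camera_ray` and its parameters are illustrative): the intrinsics fix the nominal ray direction, then per-sample antialiasing jitter and a thin-lens aperture perturb it.

```python
import numpy as np

rng = np.random.default_rng(0)

def camera_ray(u, v, fx, fy, cx, cy, aperture=0.0, focus_dist=1.0):
    """Generate one camera ray (origin, unit direction) for pixel (u, v)."""
    # Antialiasing: jitter the sample position inside the pixel.
    ju, jv = rng.random(2) - 0.5
    d = np.array([(u + ju - cx) / fx, (v + jv - cy) / fy, 1.0])
    d /= np.linalg.norm(d)
    origin = np.zeros(3)
    if aperture > 0.0:
        # Depth of field: offset the origin on the thin-lens disk and re-aim
        # the ray at the focal point, so only the focus plane stays sharp.
        theta = rng.random() * 2.0 * np.pi
        r = aperture * np.sqrt(rng.random())
        origin = np.array([r * np.cos(theta), r * np.sin(theta), 0.0])
        focal_point = d * (focus_dist / d[2])
        d = focal_point - origin
        d /= np.linalg.norm(d)
    return origin, d
```

Because every sample's origin and direction are perturbed like this, the projection matrix describes the camera only on average, not per ray.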

For extrinsics, we typically use an affine matrix instead, usually called the "model" matrix. In nvisii, we call these matrices "local_to_world" matrices, which can be obtained from the transform component.
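As a sketch of how that relates to computer-vision extrinsics (assuming a rigid local_to_world matrix; the helper name `world_to_camera` is mine, not an nvisii API), the view/extrinsics matrix is just the inverse of the model matrix:

```python
import numpy as np

def world_to_camera(local_to_world: np.ndarray) -> np.ndarray:
    """Invert a rigid 4x4 local_to_world (model) matrix to get the view matrix."""
    R = local_to_world[:3, :3]   # camera orientation in world coordinates
    t = local_to_world[:3, 3]    # camera position in world coordinates
    view = np.eye(4)
    view[:3, :3] = R.T           # inverse of a rotation is its transpose
    view[:3, 3] = -R.T @ t       # bring the world origin into the camera frame
    return view

# Example: camera translated to (1, 2, 3) with identity orientation.
M = np.eye(4)
M[:3, 3] = [1.0, 2.0, 3.0]
print(world_to_camera(M))  # translation part is (-1, -2, -3)
```

One caveat: axis conventions differ between worlds. OpenGL-style cameras look down -Z while OpenCV extrinsics look down +Z, so an additional 180-degree flip about X may be needed when converting between the two.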

natevm commented 2 years ago

@manuelli could I get you to validate the code here?

This is the code we use to transform an intrinsics matrix into a projection matrix: https://github.com/owl-project/NVISII/blob/09bea3d3b68b3ec70a7cb5e2996af964c261efb5/src/nvisii/camera.cpp#L456

Then here is the code we use to reverse that transformation: https://github.com/owl-project/NVISII/blob/09bea3d3b68b3ec70a7cb5e2996af964c261efb5/src/nvisii/camera.cpp#L424
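For anyone who wants to check that round trip numerically, here is a common OpenGL-style mapping between intrinsics and a projection matrix (one of several sign conventions; a hedged sketch, not a transcription of the linked C++). An off-center principal point lands in the third column, and a correct inverse must read it back out rather than assume the image center:

```python
import numpy as np

def intrinsics_to_projection(fx, fy, cx, cy, width, height, near=0.1, far=100.0):
    """Build an OpenGL-convention projection matrix from pinhole intrinsics."""
    P = np.zeros((4, 4))
    P[0, 0] = 2.0 * fx / width
    P[1, 1] = 2.0 * fy / height
    P[0, 2] = 1.0 - 2.0 * cx / width    # nonzero only for an off-center cx
    P[1, 2] = 2.0 * cy / height - 1.0   # nonzero only for an off-center cy
    P[2, 2] = -(far + near) / (far - near)
    P[2, 3] = -2.0 * far * near / (far - near)
    P[3, 2] = -1.0
    return P

def projection_to_intrinsics(P, width, height):
    """Exact algebraic inverse of the mapping above. A correct
    get_intrinsic_matrix must do the analogue of this, rather than
    hardcoding cx = width/2, cy = height/2."""
    fx = P[0, 0] * width / 2.0
    fy = P[1, 1] * height / 2.0
    cx = (1.0 - P[0, 2]) * width / 2.0
    cy = (P[1, 2] + 1.0) * height / 2.0
    return fx, fy, cx, cy

P = intrinsics_to_projection(450, 450, 325, 245, 640, 480)
print(projection_to_intrinsics(P, 640, 480))  # recovers (450, 450, 325, 245)
```

The bug described above is equivalent to the reverse transformation ignoring P[0, 2] and P[1, 2].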

TontonTremblay commented 2 years ago

When I tested the set_intrinsics and get_intrinsics I did not really fiddle with the values, I kept everything centered (cx=width/2). We did take some open source code, but I cannot find the code :(.

@manuelli you can always set the projection matrix yourself, https://nvisii.com/camera.html?highlight=set_projection#nvisii.camera.set_projection probably the easiest way around this limitation.

manuelli commented 2 years ago

Ok, thanks for the pointers, I will take a look. The bigger issue on my end is that get_intrinsics_matrix() returns incorrect values. Since that matrix is important for downstream robotics/computer-vision use cases, having it be unreliable can cause subtle but significant bugs down the line.

For reference, this OpenCV reference is always my go-to for understanding the intrinsics matrix. @natevm do you have a pointer to the definition of the projection matrix in the graphics/nvisii world?

natevm commented 2 years ago

@manuelli here’s an explanation I found on the web that’s pretty good : https://www.scratchapixel.com/lessons/3d-basic-rendering/perspective-and-orthographic-projection-matrix/opengl-perspective-projection-matrix

Also this https://web.cs.wpi.edu/~emmanuel/courses/cs543/f13/slides/lecture05_p2.pdf

And if those are confusing at all, there's lots of additional material on the web about projection matrices. They're often required by graphics APIs like OpenGL and DirectX as part of rasterization.

phquentin commented 1 year ago

@manuelli I have seen in issue Understanding the depth returned by render_data #125 that you are using the camera intrinsics to compute the depth image from the distance image. Do you have any new insights on that regarding the problem with the intrinsics? Do you still use these functions to compute the depth image, and if so, which intrinsics do you use?
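For context on that distance-to-depth step, a common conversion looks like the sketch below (my own sketch, assuming the distance image stores the Euclidean ray length from the camera origin; `distance_to_depth` is not an nvisii function). Given this issue, the intrinsics fed in should be the values originally passed to create_from_intrinsics, not the matrix returned by get_intrinsic_matrix:

```python
import numpy as np

def distance_to_depth(distance, fx, fy, cx, cy):
    """Convert a ray-distance image into a planar z-depth image."""
    h, w = distance.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Length of each pixel's (unnormalized) ray relative to the optical axis:
    # the ray through pixel (u, v) is ((u-cx)/fx, (v-cy)/fy, 1).
    scale = np.sqrt(((u - cx) / fx) ** 2 + ((v - cy) / fy) ** 2 + 1.0)
    return distance / scale

# Sanity check: at the principal point the ray is the optical axis,
# so distance and depth agree there; everywhere else depth < distance.
dist = np.full((480, 640), 2.0)
depth = distance_to_depth(dist, 450, 450, 325, 245)
```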