facebookresearch / pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
https://pytorch3d.org/

Inconsistency between the docs, transform_points and transform_points_screen on screen-defined cameras #1547

Open jonilaserson opened 1 year ago

jonilaserson commented 1 year ago

The cameras doc says that:

  1. In world coordinates, positive x points left.
  2. In screen space, the top-left point is (0, 0) and positive x points right.
  3. If the Camera is defined in screen space (in_ndc=False), then transform_points projects the world to screen space.
  4. transform_points_screen should also project the world to screen space.

Hence, the world point [-1, 0, 1] is to the right of point [0, 0, 1] in world coordinates, and should have a higher x value in screen space.

However in the output of the code below, the opposite is true, which is not the expected behavior.

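import torch
from pytorch3d.renderer import PerspectiveCameras

H, W = 500, 500  # assumed: not given in the original post, chosen to match principal_point

cam = PerspectiveCameras(
    focal_length=torch.tensor([[100.0, 100.0]]),
    principal_point=torch.tensor([[250.0, 250.0]]),
    image_size=((H, W),),
    in_ndc=False,
)
# World points: x = -1 lies to the right of x = 0, because +x points left
points = torch.tensor([[[-1, 0, 1], [0, 0, 1]]], dtype=torch.float32)
cam.transform_points(points)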

Output:

tensor([[[150., 250.,   1.],
         [250., 250.,   1.]]])

On the other hand, using cam.transform_points_screen(points) in the last line returns the expected output:

tensor([[[349.9999, 250.0000,   1.0000],
         [249.9999, 250.0000,   1.0000]]])

However, it's transform_points with its unexpected behavior that is used in all the main project and unproject methods. Is there a good reason why transform_points on screen cameras doesn't behave like the default transform_points_screen and according to the docs?
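
To make the discrepancy concrete, here is a minimal arithmetic sketch in plain Python. It is not PyTorch3D's internal code. It only shows that a pinhole projection with no axis flip reproduces the transform_points numbers, and that mirroring x about the image width reproduces the transform_points_screen numbers up to floating-point error. The image width W = 500 is an assumption (the post does not give it; it is consistent with the principal point of 250), and the helper names pinhole_x and flipped_x are hypothetical.

```python
# Arithmetic sketch only; this is NOT how PyTorch3D implements either method.
# ASSUMPTION: W = 500 (not given in the post; consistent with principal_point = 250).
W = 500.0
FOCAL = 100.0  # focal_length from the post
PX = 250.0     # x component of principal_point from the post


def pinhole_x(x, z, focal=FOCAL, px=PX):
    """Pinhole projection with no axis flip: u = focal * x / z + px."""
    return focal * x / z + px


def flipped_x(u, width=W):
    """Mirror a pixel column about the image width so +x points right."""
    return width - u


points = [(-1.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
unflipped = [pinhole_x(x, z) for x, _, z in points]
flipped = [flipped_x(u) for u in unflipped]

print(unflipped)  # [150.0, 250.0]  matches the transform_points output
print(flipped)    # [350.0, 250.0]  matches transform_points_screen (349.9999, 249.9999)
```

Under these assumptions the two reported outputs differ by exactly an x-axis mirror, which is the flip between the world convention (+x left) and the screen convention (+x right).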

orweiser-code commented 1 year ago

I think this is related to this issue: https://github.com/facebookresearch/pytorch3d/issues/1436 from a few months ago. Didn't get a response there yet either.

bottler commented 8 months ago

Yes, transform_points_screen is known to behave differently for different camera types, and I don't think it's going to be fixed because fixing it could be quite disruptive.