facebookresearch / pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
https://pytorch3d.org/

PerspectiveCameras issue with PulsarPointsRenderer #1352

Open maximeraafat opened 2 years ago

maximeraafat commented 2 years ago

Hi, thank you for this amazing work!

I've been running into some issues with pulsar rendering and would appreciate your help. I am loading parameters from calibrated cameras into a PerspectiveCameras object and rendering the scene with both the default PointsRenderer and the PulsarPointsRenderer classes. The default renderer works perfectly and the image is calibrated as expected, but I fail to do the same with pulsar: my rendered scene is empty.

A previous issue (#772) seems to address this problem, but it doesn't explain how to handle the focal length and principal point in the absence of K. I tried to convert them into a 4x4 intrinsics matrix K (and then followed the advice in #772), but this didn't resolve the issue.
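For reference, this is roughly how I assembled K (a sketch with placeholder values, following the 4x4 calibration matrix layout documented for screen-space PerspectiveCameras):

import torch

# Placeholder screen-space intrinsics (replace with the calibrated values).
fx, fy = 1000.0, 1000.0
px, py = 960.0, 540.0

# 4x4 calibration matrix in the layout documented for PerspectiveCameras (in_ndc=False),
# which I then passed as PerspectiveCameras(K=K, R=R, T=T, in_ndc=False, image_size=image_size).
K = torch.tensor([[
    [fx, 0.0, px, 0.0],
    [0.0, fy, py, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]])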

In another attempt, I used the pytorch3d.utils.pulsar_from_cameras_projection function as suggested in #590 and #734, but there doesn't seem to be a direct way to provide the converted parameters to a PulsarPointsRenderer object. The conversion function outputs translation, rotation, focal length, sensor width, and principal point parameters in the pulsar convention, but the PulsarPointsRenderer forward doesn't take a sensor width as input. Even when omitting the sensor width in the forward call, pulsar still doesn't render the scene from the desired view. Could you please explain how to use the mentioned camera parameters with pulsar?

Below is a code example in which I first initialise the default point and pulsar renderers, then render the scene with the default renderer (no problem here), and finally experiment with different scenarios for pulsar, all of which resulted in a fully black, empty render.

# modules
import torch
from pytorch3d.structures import Pointclouds
from pytorch3d.renderer import PerspectiveCameras
from pytorch3d.transforms import rotation_6d_to_matrix
from pytorch3d.utils.camera_conversions import pulsar_from_cameras_projection
from pytorch3d.renderer import (
    PointsRasterizationSettings,
    PointsRasterizer,
    PointsRenderer,
    PulsarPointsRenderer,    
    AlphaCompositor
)

# default synsin point renderer
def point_renderer(cameras, image_size, radius=0.005, points_per_pixel=50):
    raster_settings = PointsRasterizationSettings(
        image_size=image_size,
        radius=radius,
        points_per_pixel=points_per_pixel
    )

    rasterizer = PointsRasterizer(cameras=cameras, raster_settings=raster_settings)
    renderer = PointsRenderer(rasterizer=rasterizer, compositor=AlphaCompositor())

    return renderer

# pulsar renderer
def pulsar_renderer(cameras, image_size, radius=0.005, points_per_pixel=10):
    raster_settings = PointsRasterizationSettings(
        image_size=image_size,
        radius=radius,
        points_per_pixel=points_per_pixel
    )

    rasterizer = PointsRasterizer(cameras=cameras, raster_settings=raster_settings)
    renderer = PulsarPointsRenderer(rasterizer=rasterizer)

    return renderer

# given are R, T, focal_length and principal_point parameters

image_size = torch.Tensor( [[1080, 1920]] )
resolution = (540, 960) # also tried with (1080, 1920)

# perspective camera for default renderer
cam_default = PerspectiveCameras(focal_length=focal_length, R=R, T=T, principal_point=principal_point, in_ndc=False, image_size=image_size)

# camera conversion
pulsar_cam = pulsar_from_cameras_projection(cam_default, image_size)
pulsarT = pulsar_cam[:, :3]
pulsarR = rotation_6d_to_matrix( pulsar_cam[:, 3:9] )
pulsar_focal_length = pulsar_cam[:, 9]
pulsar_sensor_width = pulsar_cam[:, 10]
pulsar_principal_point = pulsar_cam[:, 11:]

# perspective camera attempts for pulsar renderer
cam_pulsar1 = PerspectiveCameras()
cam_pulsar2 = PerspectiveCameras(focal_length=pulsar_focal_length, R=pulsarR, T=pulsarT, principal_point=pulsar_principal_point, in_ndc=False, image_size=image_size)
cam_pulsar3 = PerspectiveCameras(focal_length=pulsar_focal_length, R=pulsarR, T=pulsarT, principal_point=pulsar_principal_point, in_ndc=True, image_size=image_size)

# xyz and rgb are given position and color tensors
pcl = Pointclouds(points=[xyz], features=[rgb])

# default rendering : no problems, image properly calibrated
default = point_renderer(cam_default, resolution, radius=0.1)
img_default = default(pcl)[0]

# pulsar renderer attempts : problems
pulsar1 = pulsar_renderer(cam_pulsar1, resolution, radius=0.1)
pulsar2 = pulsar_renderer(cam_pulsar2, resolution, radius=0.1)
pulsar3 = pulsar_renderer(cam_pulsar3, resolution, radius=0.1)

# pulsar rendering attempts with pulsar1 renderer
img_pulsar1 = pulsar1(pcl, focal_length=pulsar_focal_length, R=pulsarR, T=pulsarT, principal_point=pulsar_principal_point, gamma=(1e-5,), znear=(0.1,), zfar=(100.0,))[0]

# pulsar rendering attempts with pulsar2 renderer
img_pulsar2 = pulsar2(pcl, focal_length=pulsar_focal_length, gamma=(1e-5,), znear=(0.1,), zfar=(100.0,))[0] # if focal length is not passed, pulsar will complain about single focal length needed
img_pulsar3 = pulsar2(pcl, focal_length=pulsar_focal_length, R=pulsarR, T=pulsarT, principal_point=pulsar_principal_point, gamma=(1e-5,), znear=(0.1,), zfar=(100.0,))[0]

# pulsar rendering attempts with pulsar3 renderer
img_pulsar4 = pulsar3(pcl, focal_length=pulsar_focal_length, gamma=(1e-5,), znear=(0.1,), zfar=(100.0,))[0] # if focal length is not passed, pulsar will complain about single focal length needed
img_pulsar5 = pulsar3(pcl, focal_length=pulsar_focal_length, R=pulsarR, T=pulsarT, principal_point=pulsar_principal_point, gamma=(1e-5,), znear=(0.1,), zfar=(100.0,))[0]

Thank you in advance!

maximeraafat commented 2 years ago

Quick update: I had a look at https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/renderer/points/pulsar/unified.py, and it seems that the PulsarPointsRenderer class already handles the camera conversion to the pulsar convention (the functions _extract_intrinsics and _extract_extrinsics do this). One should therefore be able to simply pass a PerspectiveCameras object to pulsar without any prior conversion. But when I do something like the code below, I still get an empty render...

# perspective camera for default renderer
cam_default = PerspectiveCameras(focal_length=focal_length, R=R, T=T, principal_point=principal_point, in_ndc=False, image_size=image_size)

pulsar = pulsar_renderer(cam_default, resolution, radius=0.1)
img_pulsar = pulsar(pcl, gamma=(1e-5,), znear=(0.1,), zfar=(100.0,))[0]

# no need to pass a focal length to pulsar() this time, as the one in cam_default satisfies the condition in https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/renderer/points/pulsar/unified.py#L289

Attached (cow.zip) is a .ply point cloud file of the pytorch3d cow mesh, and below is code to reproduce a simple example, using the point_renderer and pulsar_renderer functions defined above.

# modules
from pytorch3d.io import IO
from pytorch3d.renderer import PerspectiveCameras
from pytorch3d.renderer import look_at_view_transform

# load cow pointcloud
pcl = IO().load_pointcloud('cow.ply')

# setup toy camera
R, T = look_at_view_transform(dist=3, elev=0, azim=140)
focal_length = ((-1000, 1000),)
principal_point = ((800, 600),)
image_size = ((1080, 1920),)
cam =  PerspectiveCameras(focal_length=focal_length, R=R, T=T, principal_point=principal_point, in_ndc=False, image_size=image_size)

# renderers
default = point_renderer(cam, (1080, 1920), radius=0.01)
pulsar = pulsar_renderer(cam, (1080, 1920), radius=0.01)

# rendering
img_default = default(pcl)[0]
img_pulsar = pulsar(pcl, gamma=(1e-5,), znear=(0.1,), zfar=(100.0,))[0]

img_default (first image) and img_pulsar (second image) are shown below, and as described, the pulsar render is empty.

img_default img_pulsar

classner commented 2 years ago

Hi @maximeraafat !

Thanks for reaching out, and sorry for the delay in responding to this issue! Indeed, the PulsarPointsRenderer handles the conversion of camera parameters directly. However, I think there is a problem with the principal point conversion which might lead to different results for the two renderers. I believe the PulsarPointsRenderer handles the principal point as originally documented (as an offset in NDC coordinates from the center of the image, where the center is (0, 0)), but the other renderer doesn't. Can you try using (0, 0) as the principal point, or convert to an offset in NDC, and see whether this resolves the problem?
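As a rough sketch of what I mean by converting to an NDC offset (not the exact library code; the sign convention here is an assumption and may need flipping):

# Rough sketch: convert a principal point in screen space (pixels, origin at the top left)
# to an NDC offset from the image center ((0, 0) = center, shorter side spanning [-1, 1]).
# The signs are an assumption and may need flipping for your setup.
def principal_point_screen_to_ndc(px, py, image_width, image_height):
    half_min_side = min(image_width, image_height) / 2.0
    px_ndc = (image_width / 2.0 - px) / half_min_side
    py_ndc = (image_height / 2.0 - py) / half_min_side
    return px_ndc, py_ndc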

maximeraafat commented 2 years ago

Hi @classner, thank you for your helpful reply!

Unfortunately, your suggestion doesn't fully solve the issue, but I think it is the right direction. I noticed heuristically that switching to (or from) NDC coordinates doesn't affect the Pulsar rendering at all (or at least not in the cow scenario shared in the code above). However, changing the principal point does seem to help. When I set the center to principal_point = ((0,0),), I get the following Pulsar render:

principal_point=(0,0)

I realised that the issue now is probably the focal length, so I first changed it to focal_length = ((-1,1),) (normalising both fx and fy with respect to themselves), and got the following:

focal_length=(-1,1)

Alright, so the scale is now right, but two issues remain: the orientation and the center. After some trials, I noticed that setting a positive focal length parameter fx leads to value errors (for instance, when setting focal_length = ((1,1),), I get ValueError: Pulsar only supports a single focal length! Provided: tensor([1., 1.]).). But passing the focal length as a single-element tensor along with gamma, znear and zfar in the rendering function works fine. Setting focal_length = torch.Tensor([1]) in the rendering function is enough to flip the orientation. However, the image is still flipped horizontally, as can be seen below:

focal_length=torch.Tensor([1])

I tried playing a bit with the rotation parameters, but I don't think this is the right way to go... Finally, after lots of experiments, I found that setting the center to principal_point = ((1.0 - 800/1080, 0.5 - 600/1920),) results in the best alignment with the original point rendering (see the Pulsar rendering below). Remember that my principal point and image size were principal_point = ((800, 600),) and image_size = ((1080, 1920),). I am not exactly sure why this center location works, but here are my thoughts. My initial idea was to set the center to (800/1080, 600/1920), simply a normalised version of the original principal point. However, as far as I understand, since we are now working in NDC, the coordinate origin is shifted to (1, 1). Therefore, since coordinates now range from -1 to 1 (or perhaps 0 to 1?), it makes sense to subtract these normalised values in order to stay in range. Subtracting fy from 1, however, didn't yield a nice alignment, and I cannot explain why 0.5 works better (perhaps some strange symmetry that wouldn't occur with the correctly flipped render).

focal_length=torch.Tensor([1]), principal_point=((1.0 - 800/1080, 0.5 - 600/1920),)
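To summarise, the combination that gave me the closest match so far is roughly the following (clearly a hack rather than a proper fix; the exact split between constructor and render-call arguments is simply what I happened to use):

# Best-aligned (but hacky) Pulsar settings found so far for the cow example above.
# Original intrinsics: focal_length = ((-1000, 1000),), principal_point = ((800, 600),),
# image_size = ((1080, 1920),).
cam_hack = PerspectiveCameras(
    focal_length=((-1, 1),),  # normalised focal length
    R=R,
    T=T,
    principal_point=((1.0 - 800/1080, 0.5 - 600/1920),),  # empirically found center
    in_ndc=False,
    image_size=image_size,
)
pulsar_hack = pulsar_renderer(cam_hack, (1080, 1920), radius=0.01)
img_hack = pulsar_hack(
    pcl,
    focal_length=torch.Tensor([1]),  # single-element focal length passed at render time
    gamma=(1e-5,),
    znear=(0.1,),
    zfar=(100.0,),
)[0]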

Again as a reference, below you can find the default point render we aim to obtain:

default

Any additional comments or suggestions are very welcome. In the meantime, I will continue testing different scenarios, perhaps also with other perspective cameras, to see whether different settings avoid these issues.

classner commented 2 years ago

Hi @maximeraafat ,

Glad we tracked down the reason (by the way, I opened a bug report on this earlier this year: https://github.com/facebookresearch/pytorch3d/issues/1276)!

Two things:

1) It is expected that you mirror the image (the unified interface does that for you automatically here: https://github.com/facebookresearch/pytorch3d/blob/1b0584f7bd2bbf0d6a2e5563a8c530d62f2338ba/pytorch3d/renderer/points/pulsar/unified.py#L552). The need to mirror is an artifact of the different conventions (handedness) of the PyTorch3D and Pulsar camera parameterizations, so no rotation can account for it in principle.

2) As we have now seen, the principal point is specified in PyTorch3D in NDC coordinates in the range [-1, 1], where (0, 0) is the image center. The Pulsar renderer itself (https://github.com/facebookresearch/pytorch3d/blob/1b0584f7bd2bbf0d6a2e5563a8c530d62f2338ba/pytorch3d/renderer/points/pulsar/renderer.py#L574) expects the principal point as an offset in pixels from the image center (0, 0). The code in the unified interface tries to convert between the two conventions here: https://github.com/facebookresearch/pytorch3d/blob/1b0584f7bd2bbf0d6a2e5563a8c530d62f2338ba/pytorch3d/renderer/points/pulsar/unified.py#L249 (mapping [-1, 1] to [-0.5, 0.5], then multiplying by width / height respectively to get the pixel value). But something must have changed in how PyTorch3D handles the value beforehand, so this conversion is no longer correct. If you are investigating, I would step into that code with pdb and make sure the value that reaches the Pulsar renderer is what you expect.
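In code, the intended conversion I am describing amounts to roughly this (a restatement, not the exact library code):

# PyTorch3D convention: principal point as an NDC offset in [-1, 1], (0, 0) = image center.
# Pulsar convention: principal point as an offset in pixels from the image center.
def ndc_offset_to_pixel_offset(p_ndc_x, p_ndc_y, width, height):
    # Map [-1, 1] to [-0.5, 0.5], then scale by the image dimensions to get pixels.
    return p_ndc_x / 2.0 * width, p_ndc_y / 2.0 * height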

maximeraafat commented 2 years ago

Hi @classner, thanks for your feedback again!

As per your suggestion, I dived deeper into the unified interface and investigated the values passed to the Pulsar renderer via pdb. It turned out that my previous experiments were only getting closer by coincidence, and slightly changing the focal length or principal point again renders two completely different things with the default point renderer and Pulsar.

I noticed that Pulsar actually doesn't consider cameras in screen space, and always assumes cameras in NDC space. I therefore tested a very simple camera in NDC space, to see whether the default point renderer and Pulsar are capable of rendering the same scene there, but even this fails...

My example is the following (everything is the same as above, except for the camera, and a radius of 0.001 passed to pulsar_renderer), and produces the images below:

R, T = look_at_view_transform(dist=2, elev=0, azim=140)
focal_length = ((-1, 1),)
cam =  PerspectiveCameras(focal_length=focal_length, R=R, T=T)

Default point render: default22

and Pulsar render: pulsar22

Even when flipping the render along the first axis and adapting the principal point manually, the focal length still does not seem to be right, and manually adjusting it for every camera to get the closest possible match is obviously not the right way to go.
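For reference, the manual flip I mean is simply:

# Mirror the Pulsar render along the first image axis.
img_pulsar_flipped = torch.flip(img_pulsar, dims=(0,))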

If you have any further feedback or suggestions, every idea is welcome! 🙂

classner commented 2 years ago

Hi @maximeraafat ,

I just looked into this with https://github.com/facebookresearch/pytorch3d/pull/1369. You can clone directly from there, and everything works as expected for all cameras, in both NDC and non-NDC format. The problem was due to changes in the PyTorch3D cameras that were not propagated to the unified interface.

Test case (uncomment different camera types):

#!/usr/bin/env python3
"""
Unified interface test.
"""
import torch
import cv2  # Quick debugging visualizations.
from pytorch3d.renderer import (
    PointsRasterizationSettings,
    PointsRasterizer,
    PointsRenderer,
    PulsarPointsRenderer,
    AlphaCompositor,
)
from pytorch3d.io import IO
from pytorch3d.renderer import (  # noqa: F401
    PerspectiveCameras,
    OrthographicCameras,
    FoVOrthographicCameras,
    FoVPerspectiveCameras,
)
from pytorch3d.renderer import look_at_view_transform

in_ndc = True  # Provide arguments in NDC or non-NDC format.
size_fac = 4.0  # Render at this fraction of original image size for fast debugging.

# load cow pointcloud
pcl = IO().load_pointcloud("cow.ply")
# setup toy camera
R, T = look_at_view_transform(dist=3, elev=40, azim=140)
# generate some configurations.
image_size = ((int(1080.0 / size_fac), int(1920.0 / size_fac)),)
if not in_ndc:
    principal_point = (
        ((1920.0 / 2.0 + 200.0) / size_fac, (1080.0 / 2.0 + 100) / size_fac),
    )
    focal_length = ((1000.0 / size_fac, 1000.0 / size_fac),)
else:
    principal_point = ((0.3, 0.2),)
    focal_length = (
        (
            1000.0 / image_size[0][0] / 2.0,
            1000.0 / image_size[0][0] / 2.0,
        ),
    )

resolution = image_size[0]
# test different cameras.
cam = PerspectiveCameras(
    focal_length=focal_length,
    R=R,
    T=T,
    principal_point=principal_point,
    in_ndc=in_ndc,
    image_size=image_size,
)
# cam = OrthographicCameras(
#     focal_length=focal_length,
#     R=R,
#     T=T,
#     principal_point=principal_point,
#     in_ndc=in_ndc,
#     image_size=image_size,
# )
# cam = FoVOrthographicCameras(
#     R=R,
#     T=T,
# )
# cam = FoVPerspectiveCameras(fov=60.0, R=R, T=T, in_ndc=in_ndc)

def point_renderer(cameras, image_size, radius=0.005, points_per_pixel=50):
    raster_settings = PointsRasterizationSettings(
        image_size=image_size, radius=radius, points_per_pixel=points_per_pixel
    )

    rasterizer = PointsRasterizer(cameras=cameras, raster_settings=raster_settings)
    renderer = PointsRenderer(rasterizer=rasterizer, compositor=AlphaCompositor())

    return renderer

# pulsar renderer
def pulsar_renderer(cameras, image_size, radius=0.005, points_per_pixel=10):
    raster_settings = PointsRasterizationSettings(
        image_size=image_size, radius=radius, points_per_pixel=points_per_pixel
    )

    rasterizer = PointsRasterizer(cameras=cameras, raster_settings=raster_settings)
    renderer = PulsarPointsRenderer(rasterizer=rasterizer)

    return renderer

# renderers
default = point_renderer(cam, resolution, radius=0.01)
pulsar = pulsar_renderer(cam, resolution, radius=0.01)

# rendering
img_default = default(pcl)[0]
cv2.imshow("default", (img_default * 255.0).cpu().to(torch.uint8).numpy()[:, :, ::-1])
img_pulsar = pulsar(pcl, gamma=(1e-5,), znear=(0.1,), zfar=(100.0,))[0]
cv2.imshow("pulsar", (img_pulsar * 255.0).cpu().to(torch.uint8).numpy()[:, :, ::-1])
img_overlay = img_default
img_overlay[:, :, 0] = img_pulsar[:, :, 0]
cv2.imshow(
    "overlay",
    (img_overlay * 255.0).cpu().to(torch.uint8).numpy()[:, :, ::-1],
)
cv2.waitKey(0)

@bottler, @gkioxari: it would be great if we could introduce another test case for the unified interface. I used to have one in the past (comparing results against ground-truth expected results, just placing some points on a sphere; that made debugging of point size quite easy), but it must have been removed at some point. I am happy to have fixed it this time, but it would be great if breaking changes could be detected by such a test case and then fixed by the change author. Otherwise there's always a bunch of reverse engineering...
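A minimal sketch of what such a test could look like (hypothetical, not an existing test; it only compares foreground coverage of the two renderers on a random point cloud, and the threshold is a placeholder):

import torch
from pytorch3d.renderer import (
    AlphaCompositor,
    PerspectiveCameras,
    PointsRasterizationSettings,
    PointsRasterizer,
    PointsRenderer,
    PulsarPointsRenderer,
    look_at_view_transform,
)
from pytorch3d.structures import Pointclouds

def test_unified_pulsar_matches_points_renderer():
    torch.manual_seed(0)
    R, T = look_at_view_transform(dist=3.0, elev=20.0, azim=30.0)
    cameras = PerspectiveCameras(focal_length=((-2.0, 2.0),), R=R, T=T)
    raster_settings = PointsRasterizationSettings(
        image_size=128, radius=0.02, points_per_pixel=10
    )
    rasterizer = PointsRasterizer(cameras=cameras, raster_settings=raster_settings)
    points = torch.randn(500, 3) * 0.3
    colors = torch.ones(500, 3)
    pcl = Pointclouds(points=[points], features=[colors])

    img_points = PointsRenderer(rasterizer=rasterizer, compositor=AlphaCompositor())(pcl)[0]
    img_pulsar = PulsarPointsRenderer(rasterizer=rasterizer)(
        pcl, gamma=(1e-5,), znear=(0.1,), zfar=(100.0,)
    )[0]

    # The shading models differ, so only compare which pixels are covered at all.
    mask_points = img_points[..., :3].sum(-1) > 0
    mask_pulsar = img_pulsar[..., :3].sum(-1) > 0
    intersection = (mask_points & mask_pulsar).float().sum()
    union = (mask_points | mask_pulsar).float().sum()
    assert intersection / union > 0.5  # placeholder threshold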

classner commented 2 years ago

@maximeraafat side note: in case you are experimenting for benchmarking, using the direct interface saves a bunch of conversions in the unified interface and is significantly faster. If you are looking at a fixed test case, it's easy to just grab the parameters for Pulsar from the unified interface here: https://github.com/facebookresearch/pytorch3d/blob/1b0584f7bd2bbf0d6a2e5563a8c530d62f2338ba/pytorch3d/renderer/points/pulsar/unified.py#L543 and use them directly. Many of the things done in the unified interface are only there to produce results 'as equivalent as possible' (for example, the scaling of points to match sizes in screen space), which may or may not be desired...
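For reference, the direct interface looks roughly like this (a sketch along the lines of the basic Pulsar example shipped with PyTorch3D; cam_params is camera position, axis-angle rotation, focal length and sensor width, all in world units):

import torch
from pytorch3d.renderer.points.pulsar import Renderer as PulsarDirectRenderer

n_points = 10
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Arguments: width, height, maximum number of balls.
renderer = PulsarDirectRenderer(1000, 1000, n_points).to(device)

# Random points, colors and radii in front of the camera.
vert_pos = torch.rand(n_points, 3, device=device) * 10.0
vert_pos[:, 2] += 25.0
vert_pos[:, :2] -= 5.0
vert_col = torch.rand(n_points, 3, device=device)
vert_rad = torch.rand(n_points, device=device)

cam_params = torch.tensor(
    [
        0.0, 0.0, 0.0,  # camera position (x, y, z)
        0.0, 0.0, 0.0,  # camera rotation in axis-angle format
        5.0,            # focal length in world units
        2.0,            # sensor width in world units
    ],
    device=device,
)

# Positional arguments after cam_params: gamma and maximum depth.
image = renderer(vert_pos, vert_col, vert_rad, cam_params, 1.0e-1, 45.0)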

maximeraafat commented 2 years ago

Hi @classner, thank you so much for the fix! I didn't get to experiment yet with the updated unified interface, but if I encounter more issues I will let you know.

Also, thank you for the very helpful tip on using the direct interface instead of the unified interface, I will definitely make use of it!

TheNeeloy commented 5 months ago

Hi! I was trying out the test provided above with the PerspectiveCameras definition, comparing the renderings between the PyTorch3D and Pulsar renderers. I'm finding that the code throws ValueError: Pulsar only supports a single focal length! Provided: tensor([1.8519, 1.8519], device='cuda:0'). When I comment out that check in the source code in /pytorch3d/renderer/points/pulsar/unified.py, the rendered cows still do not align. I tried this code with PyTorch3D versions 0.7.5, 0.7.2, and 0.7.1, and they all produced the same scaled renderings. Is this the expected output using Pulsar? What changes would be required to make the renderings the same across PyTorch3D and Pulsar? Thanks for your time! Screenshot from 2024-06-10 10-32-37

TheNeeloy commented 5 months ago

Actually, I just pulled the pull request https://github.com/facebookresearch/pytorch3d/pull/1369 and installed from source, and the code above works as expected. Is the pull request expected to be merged into future releases? image

bottler commented 2 months ago

The pull request isn't quite right: e.g. it is asymmetrical with respect to horizontal and vertical, and it breaks tests in test_camera_pixels which are actually correct. It needs fixing by someone before it can be merged.