facebookresearch / pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
https://pytorch3d.org/

Wrong rendering result of point cloud #373

Closed: BostonLobster closed this issue 4 years ago

BostonLobster commented 4 years ago


🐛 Bugs / Unexpected behaviors

I'm trying to use the code from render_colored_points.ipynb to render a point cloud from ScanNet. My procedure is as follows:

  1. Read the .ply point cloud file and convert it to a Pointclouds instance.
  2. Read the intrinsics and extrinsics.
  3. Create an SfMPerspectiveCameras instance with the intrinsics and extrinsics.
  4. Render the point cloud.

But the rendering result is very different from the ground-truth view image, as shown below:

[images: rendered result vs. ground truth (GT)]

The rendered image was downloaded from Jupyter and has been resized; its original size is (1296, 1296). The GT image is (968, 1296).

So you can see the camera is in the wrong place! The correct camera pose looks at the two screens, but the rendered result looks down from above the room.

I know that the extrinsics of ScanNet are in the OpenCV coordinate system, so I checked the information at https://github.com/facebookresearch/pytorch3d/blob/master/docs/notes/cameras.md and https://github.com/vvvv/VL.OpenCV/wiki/Coordinate-system-conversions-between-OpenCV,-DirectX-and-vvvv. I found that in OpenCV the x-axis points right and the y-axis points down, so I guessed I just need to rotate them around the z-axis by pi so that they are aligned with pytorch3d. I tried adding a minus sign to the first two columns of R, but got an empty rendering result.
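For reference, a minimal sketch of that kind of flip, under the assumption (not verified in this thread) that the pose file stores a camera-to-world matrix in OpenCV convention. Because PyTorch3D applies R to row vectors (X_cam = X_world @ R + T), negating the first two columns of the camera-to-world rotation is plausible for R, but T then has to come from the inverted pose rather than being read straight off the last column:

```python
import numpy as np

# ASSUMPTION: the file holds a 4x4 camera-to-world pose in OpenCV
# convention (x right, y down, z forward). Verify against your data.
pose = np.loadtxt('./scene0010_00/pose/0000.txt')
R_c2w, t_c2w = pose[:3, :3], pose[:3, 3]

# Rotating the x-y plane by pi about z flips the x and y axes.
S = np.diag([-1.0, -1.0, 1.0])

R = R_c2w @ S             # negate the first two columns of R
T = -S @ R_c2w.T @ t_c2w  # world-to-camera translation, with the same flip
```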

Instructions To Reproduce the Issue:


  1. Following is my code:

```python
import os
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt
from skimage.io import imread

# Util function for loading point clouds
import numpy as np

# Data structures and functions for rendering
from pytorch3d.structures import Pointclouds
from pytorch3d.renderer import (
    look_at_view_transform,
    OpenGLOrthographicCameras,
    PointsRasterizationSettings,
    PointsRenderer,
    PointsRasterizer,
    AlphaCompositor,
    NormWeightedCompositor,
    SfMPerspectiveCameras,
    SfMOrthographicCameras,
    OpenGLPerspectiveCameras
)
import trimesh

# Setup
if torch.cuda.is_available():
    device = torch.device("cuda:0")
    torch.cuda.set_device(device)
else:
    device = torch.device("cpu")

# Load point cloud
pc = trimesh.load('./scene0010_00/scene0010_00_vh_clean.ply')
verts = torch.Tensor(pc.vertices).to(device)
rgb = torch.Tensor(pc.visual.vertex_colors[:, :3] / 255.).to(device)
point_cloud = Pointclouds(points=[verts], features=[rgb])

to_tensor = lambda x: [torch.Tensor(i).unsqueeze(0) for i in x]

# Load camera parameters
pose_path = './scene0010_00/pose/0000.txt'
intrinsic_path = './scene0010_00/intrinsic/intrinsic_color.txt'
extrinsic = np.loadtxt(pose_path)

R, T = to_tensor([extrinsic[:3, :3], extrinsic[:3, -1]])
K = torch.Tensor(np.loadtxt(intrinsic_path)).unsqueeze(0)

image_size = 1296

# Convert the focal length and principal point from screen to NDC coordinates
f_screen = torch.stack([K[:, 0, 0], K[:, 1, 1]], dim=1)
p_screen = torch.stack([K[:, 0, 2], K[:, 1, 2]], dim=1)

f_ndc = f_screen * 2.0 / image_size
p_ndc = -(p_screen - image_size / 2.0) * 2.0 / image_size

cameras = SfMPerspectiveCameras(focal_length=f_ndc, principal_point=p_ndc, R=R, T=T, device=device)

# Define the settings for rasterization and shading. As we are rendering
# images for visualization purposes only, we set points_per_pixel=10 and
# radius=0.003. Refer to rasterize_points.py for explanations of these
# parameters.
raster_settings = PointsRasterizationSettings(
    image_size=image_size,
    radius=0.003,
    points_per_pixel=10,
    bin_size=100
)

# Create a points renderer by compositing points using an alpha compositor
# (nearer points are weighted more heavily).
renderer = PointsRenderer(
    rasterizer=PointsRasterizer(cameras=cameras, raster_settings=raster_settings),
    compositor=AlphaCompositor()
)

images = renderer(point_cloud)
plt.figure(figsize=(10, 10))
plt.imshow(images[0, ..., :3].cpu().numpy())
plt.grid("off")
plt.axis("off")
```
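For what it's worth, a quick sanity check of the screen-to-NDC conversion above, as a minimal sketch with hypothetical intrinsic values (not the actual ScanNet ones): a principal point exactly at the image center should map to (0, 0) in NDC, and the sign flip reflects that PyTorch3D's NDC axes point left and up while screen axes point right and down.

```python
import torch

image_size = 1296
f_screen = torch.tensor([[1170.0, 1170.0]])  # hypothetical focal length in pixels
p_screen = torch.tensor([[648.0, 648.0]])    # principal point at the image center

f_ndc = f_screen * 2.0 / image_size                        # -> [[1.8056, 1.8056]]
p_ndc = -(p_screen - image_size / 2.0) * 2.0 / image_size  # -> (0, 0): image center
                                                           #    maps to the NDC origin
```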

gkioxari commented 4 years ago

Hi @BostonLobster I think this is an issue of coordinate systems. You need to figure out the coordinate system of the world space and also the convention of the R, T given by ScanNet. I haven't worked with ScanNet a lot, but I think their R, T follow a different convention.
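One concrete way to sanity-check the convention, as a sketch reusing the `cameras` object built in the code above: compare the camera center PyTorch3D infers from R, T with where the ScanNet sensor should actually sit in the scene.

```python
# If this world-space position does not match where the ScanNet camera
# should be (e.g. in front of the two screens), the R, T convention is off.
center = cameras.get_camera_center()  # (N, 3) camera centers in world space
print(center)
```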

BostonLobster commented 4 years ago

@gkioxari Thanks for your reply. However, as far as I know, the R, T of ScanNet are in the OpenCV coordinate system, shown in the following figure:

[figure: OpenCV coordinate system (x right, y down, z forward)]

and pytorch3d uses the coordinate system below:

[figure: PyTorch3D coordinate system (x left, y up, z forward)]

So, by rotating the X-Y plane around the z-axis by pi in the OpenCV coordinate system, we get the pytorch3d coordinates. Is anything wrong with that?
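For concreteness, a rotation by pi about the z-axis is exactly diag(-1, -1, 1), i.e. it negates the x and y components; a quick numpy check:

```python
import numpy as np

theta = np.pi
# Standard rotation matrix about the z-axis
Rz = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])

# Rotating by pi about z is the same as flipping the x and y axes.
assert np.allclose(Rz, np.diag([-1.0, -1.0, 1.0]))
```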

BostonLobster commented 4 years ago

My above understanding is correct: by rotating the X-Y plane around the z-axis we can get the right coordinates. The wrong rendering result came from a mistake made elsewhere.

ZX-Yin commented 3 years ago

> My above understanding is correct: by rotating the X-Y plane around the z-axis we can get the right coordinates. The wrong rendering result came from a mistake made elsewhere.

Hi, have you solved the problem? I've been stuck on this problem for a long time.

Minisal commented 2 years ago

@BostonLobster @JasonYinn

Hello, did either of you solve this problem? I tried rotating the X-Y plane around the z-axis, but I still don't get the correct rendering result.