facebookresearch / pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
https://pytorch3d.org/

mesh seems to be right but rendering results have poor noise. #1628

Closed seemaywang closed 3 months ago

seemaywang commented 1 year ago

I use PyTorch3D to construct a mesh from an RGB color image and a depth image. The constructed mesh seems to be right, but I have two problems:

  1. When the target camera extrinsic is the identity matrix, the rendered image has a lot of noise. When the extrinsic has a slight offset from the identity matrix, the noise is weaker but still present.

  2. The rendered image is missing the rightmost column and the bottom row of pixels.

Here is my rendering code:

```python
import torch
from pytorch3d.renderer import PerspectiveCameras


def cameras_from_opencv_projection_no_ndc(R, tvec, camera_matrix, image_size, device):
    # Intrinsics in pixels (screen space, not NDC).
    focal_length = torch.stack([camera_matrix[:, 0, 0], camera_matrix[:, 1, 1]], dim=-1)
    principal_point = camera_matrix[:, :2, 2]

    # OpenCV -> PyTorch3D: transpose R and flip the x and y axes.
    R_pytorch3d = R.clone().permute(0, 2, 1)
    T_pytorch3d = tvec.clone()
    R_pytorch3d[:, :, :2] *= -1
    T_pytorch3d[:, :2] *= -1
    return PerspectiveCameras(focal_length=focal_length,
                              principal_point=principal_point,
                              R=R_pytorch3d,
                              T=T_pytorch3d,
                              device=device,
                              in_ndc=False,
                              image_size=image_size)
```
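The axis flips in this conversion can be sanity-checked without a renderer. The sketch below is a NumPy-only illustration, not part of the original post: it assumes the usual conventions (OpenCV transforms column vectors as `R @ X + t` with +x right and +y down; PyTorch3D transforms row vectors as `X @ R + T` with +x left and +y up), and the helper name `opencv_to_pytorch3d` and the test pose are hypothetical.

```python
import numpy as np


def opencv_to_pytorch3d(R_cv, t_cv):
    """Unbatched version of the conversion used in the post (hypothetical helper)."""
    R_p3d = R_cv.T.copy()   # row-vector convention: X @ R
    t_p3d = t_cv.copy()
    R_p3d[:, :2] *= -1      # flip the camera x and y axes
    t_p3d[:2] *= -1
    return R_p3d, t_p3d


# Hypothetical test pose: 10 degree rotation about y plus a translation.
angle = np.deg2rad(10.0)
R_cv = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                 [0.0, 1.0, 0.0],
                 [-np.sin(angle), 0.0, np.cos(angle)]])
t_cv = np.array([0.1, -0.2, 0.3])
X_world = np.array([0.5, 0.25, 2.0])

X_cam_cv = R_cv @ X_world + t_cv       # OpenCV camera coordinates
R_p3d, t_p3d = opencv_to_pytorch3d(R_cv, t_cv)
X_cam_p3d = X_world @ R_p3d + t_p3d    # PyTorch3D camera coordinates

# The two results should differ only by the sign of x and y.
flip = np.array([-1.0, -1.0, 1.0])
assert np.allclose(X_cam_p3d, flip * X_cam_cv)
```

If the assertion holds, the extrinsic conversion itself is consistent, which would suggest looking elsewhere (intrinsics, image size, or the mesh) for the cause of the artifacts.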

```python
def _render_one_mesh(self, mesh, tgt_pose_w2c_N44, tgt_k_N33, height, width, only_zbuf=False, rend_mask=False):
    if len(tgt_pose_w2c_N44.shape) == 2:
        tgt_pose_w2c_N44 = tgt_pose_w2c_N44.unsqueeze(0)
    if len(tgt_k_N33.shape) == 2:
        tgt_k_N33 = tgt_k_N33.unsqueeze(0)
    R, T = tgt_pose_w2c_N44[:, :3, :3], tgt_pose_w2c_N44[:, :3, 3]
    self._rend_mask = rend_mask
    image_size = torch.FloatTensor([[width, height]]).to(self._device)
    cameras = cameras_from_opencv_projection_no_ndc(R, T, tgt_k_N33, image_size, self._device)
    raster_setting = RasterizationSettings(image_size=[width, height])
    rasterizer = MeshRasterizer(cameras=cameras, raster_settings=raster_setting)
    shader = MyShader(device=self._device, cameras=cameras, only_zbuf=only_zbuf)
    renderer = MeshRenderer(rasterizer=rasterizer, shader=shader)
    image = renderer(mesh)
    return image
```

The rendering result is saved as follows:

```python
tgt_img = np.clip(image[0][..., :3].cpu().numpy() * 255, 0, 255)
cv2.imwrite("img.png", tgt_img.astype(np.uint8))
```

Here are my mesh and texture:

[images: "image", "debug"]

When the target camera extrinsic equals the identity matrix, the rendering result is:

[images: "res_0", "0_00000"]

You can see problems 1 and 2 described above.

When the target camera extrinsic is `tgt_pose = np.array([[1.0, 0.0, 0.0, 0.005], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]], dtype=float)`, the rendering result is:

[image: "image"]

The noise is weaker but still visible if you look carefully.

Does anyone know how to solve this problem? Thanks a lot!

kylesargent commented 7 months ago

I also noticed this issue.

bottler commented 6 months ago

Is there another piece of the mesh very close to the camera (like the other side of the surrounding buildings)? Maybe it needs to be clipped out.
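One way to test this hypothesis before changing any renderer settings is to drop the faces that come too close to the camera and render again. The sketch below is a NumPy-only illustration under stated assumptions: the helper `drop_near_faces`, the `z_near` threshold, and the toy mesh are hypothetical, and the pose is taken to be an OpenCV-style world-to-camera transform as in the original post. Depth is the camera-space z, which is the same in the OpenCV and PyTorch3D conventions.

```python
import numpy as np


def drop_near_faces(verts_world, faces, R_w2c, t_w2c, z_near=0.05):
    """Keep only faces whose three vertices all lie beyond z_near (hypothetical helper).

    verts_world: (V, 3) floats, faces: (F, 3) vertex indices,
    R_w2c: (3, 3) rotation, t_w2c: (3,) translation, both world-to-camera.
    """
    verts_cam = verts_world @ R_w2c.T + t_w2c    # (V, 3) camera-space vertices
    depth = verts_cam[:, 2]                      # distance along the viewing axis
    keep = (depth[faces] > z_near).all(axis=1)   # (F,) mask of faces to keep
    return faces[keep], keep


# Toy mesh: one face in front of the camera, one touching it, one behind it.
verts = np.array([[0.0, 0.0, 1.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0],
                  [0.0, 0.0, 0.01],
                  [1.0, 0.0, -0.5]])
faces = np.array([[0, 1, 2], [0, 1, 3], [1, 2, 4]])

kept_faces, mask = drop_near_faces(verts, faces, np.eye(3), np.zeros(3))
print(kept_faces)  # only the first face survives
```

If the noise disappears after filtering, near-camera geometry is the likely culprit. `RasterizationSettings` also has `z_clip_value` and `cull_to_frustum` options that may be worth trying for the same purpose.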