facebookresearch / co3d

Tooling for the Common Objects In 3D dataset.

Centering camera extrinsics to find absolute positions #64

Open Maro1 opened 1 year ago

Maro1 commented 1 year ago

When visualizing the camera extrinsics for some of the objects, it seems that the origin is not at (0, 0, 0) and that the reference frame is offset (the circle formed by all the cameras is tilted by an angle). I am trying to find the absolute positions (and preferably rotations) of the cameras, so I am wondering whether this information is contained in the dataset?
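
For reference, a minimal sketch of the check I am doing (the R and T tensors below are hypothetical stand-ins; in practice they come from the CO3D frame annotations):

import torch
from pytorch3d.renderer import PerspectiveCameras

# hypothetical stand-ins for extrinsics loaded from a CO3D sequence
R = torch.eye(3).expand(10, 3, 3).clone()  # (N, 3, 3) rotations
T = torch.randn(10, 3)                     # (N, 3) translations

cameras = PerspectiveCameras(R=R, T=T)
centers = cameras.get_camera_center()  # (N, 3) camera centers in world coordinates
print(centers.mean(dim=0))  # centroid of the camera ring; not exactly (0, 0, 0)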

davnov134 commented 1 year ago

Hi, we zero-centered every point cloud and adjusted the camera extrinsics accordingly. Depending on the camera trajectory around the object, the centroid of the camera centers will usually lie a bit above (0, 0, 0), since most turkers captured the objects from above. Finding a canonical rotation is very hard for most object categories. That said, we found that many objects can be put "upright" using the following function:

import copy
from typing import Tuple

import torch
from pytorch3d.renderer.cameras import CamerasBase
from pytorch3d.structures import Pointclouds
from pytorch3d.transforms import so3_exp_map

def adjust_scene_scale_up_vector(
    cameras: CamerasBase,
    pcl: Pointclouds,
    rescale_factor: float = 1.0,
    to_vec=(0.0, -1.0, 0.0),
    from_vec=(-0.0396, -0.8306, -0.5554),  # in most cases, corresponds to the ground plane normal in CO3Dv2 scenes
) -> Tuple[CamerasBase, Pointclouds]:
    """
    Rotates the scene so that from_vec is aligned with to_vec and rescales
    it by rescale_factor, applying the same similarity transform to the
    point cloud and to the camera extrinsics.
    """
    # no extra translation is applied; kept so the full similarity transform is explicit
    T_adjust = torch.zeros(3, device=cameras.device)

    # axis-angle vector aligning from_vec with to_vec; for unit vectors its
    # magnitude is sin(theta), a close approximation of the true angle theta
    # for moderate rotations
    rot_axis_angle = torch.cross(
        torch.FloatTensor(to_vec),
        torch.FloatTensor(from_vec),
    ).to(cameras.device)
    R_adjust = so3_exp_map(rot_axis_angle[None])[0]

    # adjust the point cloud: translate, rescale, then rotate
    pcl = pcl.update_padded(pcl.points_padded() + T_adjust)
    pcl = pcl.update_padded(rescale_factor * pcl.points_padded())
    pcl = pcl.update_padded(pcl.points_padded() @ R_adjust[None])

    # adjust the cameras by composing the inverse transform into R and T
    cameras_a = copy.deepcopy(cameras)
    align_t_R = R_adjust.t()
    align_t_T = -rescale_factor * T_adjust[None] @ align_t_R
    align_t_s = rescale_factor
    cameras_a.T = (
        torch.bmm(
            align_t_T[:, None].repeat(cameras_a.R.shape[0], 1, 1),
            cameras_a.R,
        )[:, 0]
        + cameras_a.T * align_t_s
    )
    cameras_a.R = torch.bmm(
        align_t_R[None].expand_as(cameras_a.R),
        cameras_a.R,
    )

    return cameras_a, pcl
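
A minimal usage sketch for the function above (the tensors are random stand-ins; in practice R, T, and the points come from a CO3D sequence loaded with this repo's dataset tooling, and rescale_factor=1.0 is just an illustrative value):

import torch
from pytorch3d.renderer import PerspectiveCameras
from pytorch3d.structures import Pointclouds

# random stand-ins for real CO3D camera extrinsics and the sequence point cloud
R = torch.eye(3).expand(10, 3, 3).clone()  # (N, 3, 3) rotations
T = torch.randn(10, 3)                     # (N, 3) translations
points = torch.randn(1, 5000, 3)           # (1, P, 3) padded point cloud

cameras = PerspectiveCameras(R=R, T=T)
pcl = Pointclouds(points=points)
cameras_up, pcl_up = adjust_scene_scale_up_vector(cameras, pcl, rescale_factor=1.0)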
Mlosser commented 5 months ago

Hi, how do you compute the rescale_factor?