princeton-vl / DROID-SLAM

BSD 3-Clause "New" or "Revised" License

About traj_est and poses #77

Closed jugwangjin closed 1 year ago

jugwangjin commented 1 year ago

Hi, thank you for the amazing work.

I'm trying to use this framework for a novel view synthesis task, since my view synthesis framework is based on depth.

I'm trying to get depth maps using the point cloud from reconstruction_path and the camera poses in poses.npy.

To test the depth acquisition pipeline, I'm reprojecting the point cloud using poses.npy.

While doing this, I found that some columns of poses.npy and traj_est (saved as a .npy file) have opposite signs.

What is the difference between traj_est and poses?
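
To make the sign flip concrete, here is a minimal check. It is only a sketch, assuming both files store poses as 7-vectors [tx, ty, tz, qx, qy, qz, qw] and that the rows of poses.npy line up with traj_est[tstamps]; if one file holds the inverse transform of the other (world-to-camera vs. camera-to-world), translation and quaternion components can flip sign:

import numpy as np
from scipy.spatial.transform import Rotation

def to_matrix(p):
    # [tx, ty, tz, qx, qy, qz, qw] -> 4x4 rigid transform
    T = np.eye(4)
    T[:3, :3] = Rotation.from_quat(p[3:]).as_matrix()
    T[:3, 3] = p[:3]
    return T

poses = np.load("poses.npy")        # keyframe poses
traj_est = np.load("traj_est.npy")  # per-frame trajectory
tstamps = np.load("tstamps.npy").astype(int)

# If one file is the inverse of the other, this product should be
# close to the identity for every keyframe.
print(to_matrix(poses[0]) @ to_matrix(traj_est[tstamps[0]]))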

Wjt-shift commented 1 year ago

I have the same problem. Have you solved it?

jugwangjin commented 1 year ago

I have the same problem. Have you solved it?

Hi, it's a bit late, but FYI, I more or less solved it.

What I wanted was to render depth maps. What we have from DROID-SLAM is a point cloud and a camera trajectory.

To render depth maps using pyrender, traj_est.npy is what I needed, not poses.npy.

For pyrender:

import numpy as np
import open3d as o3d
import pyrender
from pathlib import Path
from tqdm import tqdm

in_dir = Path("...")  # directory holding the saved .npy files
pcd_path = in_dir / "scene.pcd"  # the saved point cloud
# clipping planes and point size: example values, tune for your scene
znear, zfar = 0.01, 100.0
point_size = 2.0

intrinsics = np.load(str(in_dir / "intrinsics.npy"))  # downscaled by 8
tstamps = np.load(str(in_dir / "tstamps.npy")).astype(np.int32)
traj_est = np.load(str(in_dir / "traj_est.npy"))
cam_poses = traj_est[tstamps]  # pick the keyframe poses out of the full trajectory

# intrinsics: scale back up by 8 to match the full-resolution images
Ks = np.zeros((cam_poses.shape[0], 3, 3))
Ks[:, 0, 0] = intrinsics[0, 0] * 8.0  # fx
Ks[:, 1, 1] = intrinsics[0, 1] * 8.0  # fy
Ks[:, 0, 2] = intrinsics[0, 2] * 8.0  # cx
Ks[:, 1, 2] = intrinsics[0, 3] * 8.0  # cy
Ks[:, 2, 2] = 1.0

# extrinsics: each pose is a 7-vector [tx, ty, tz, qx, qy, qz, qw]
ts = cam_poses[:, :3]

quaternions = cam_poses[:, 3:]
Rs = np.zeros((cam_poses.shape[0], 3, 3))
for idx in range(cam_poses.shape[0]):
    # rotm_from_quat: quaternion -> 3x3 rotation matrix (helper from my own code, not DROID-SLAM)
    Rs[idx] = co.colmap.rotm_from_quat(quaternions[idx])

# get image size
images = np.load(str(in_dir / "images.npy"))
height, width = images.shape[2:]

# render a depth map for each camera pose (keyframes, presumably)

# load the saved point cloud via open3d
pcd = o3d.io.read_point_cloud(str(pcd_path))
verts = np.asarray(pcd.points).astype(np.float32)
colors = np.asarray(pcd.colors).astype(np.float32)

for K, R, t in zip(tqdm(Ks), Rs, ts):
    scene = pyrender.Scene()
    mesh = pyrender.Mesh.from_points(verts, colors=colors)
    mesh_node = pyrender.Node(mesh=mesh, matrix=np.eye(4))
    scene.add_node(mesh_node)

    cam = pyrender.IntrinsicsCamera(
        fx=K[0, 0],
        fy=K[1, 1],
        cx=K[0, 2],
        cy=K[1, 2],
        znear=znear,
        zfar=zfar,
    )
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    # flip axes to go from the OpenCV camera convention to OpenGL's (pyrender)
    cv2gl = np.array(
        [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
    )
    T = T @ cv2gl
    cam_node = pyrender.Node(camera=cam, matrix=T)
    scene.add_node(cam_node)

    light = pyrender.DirectionalLight(color=np.ones(3), intensity=3)
    light_node = pyrender.Node(light=light, matrix=np.eye(4))
    scene.add_node(light_node, parent_node=cam_node)

    render = pyrender.OffscreenRenderer(width, height, point_size=point_size)
    _, depth = render.render(scene, flags=pyrender.RenderFlags.RGBA)
    render.delete()  # free the GL context before the next frame

I wrote this code to render depth maps from the point cloud, traj_est.npy, tstamps.npy, and intrinsics.npy.
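
One note on the output: the depth values are in the same (arbitrary, monocular) scale as the trajectory, and pixels no point projects to come back as 0. A quick sanity check for one map, assuming matplotlib is available:

import matplotlib.pyplot as plt

plt.imshow(depth)  # zeros = pixels with no projected points
plt.colorbar(label="depth (scene units, up to scale)")
plt.savefig("depth_check.png")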

Wjt-shift commented 1 year ago

Hi, thanks for your reply! I want to know whether you evaluated traj_est against ground truth. Does traj_est have metric scale, and are the traj_est poses camera-to-world? I used evo to evaluate traj_est against the ground-truth poses but got a large error.
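
For what it's worth, a monocular trajectory is only recovered up to scale, so evo needs a Sim(3) alignment (evo_ape with --align and --correct_scale) before the ATE is meaningful; without it the error can look huge. A minimal sketch with evo's Python API, assuming both trajectories have been exported in TUM format (file names are placeholders):

from evo.core import metrics, sync
from evo.tools import file_interface

traj_ref = file_interface.read_tum_trajectory_file("groundtruth.txt")
traj_est = file_interface.read_tum_trajectory_file("traj_est_tum.txt")
traj_ref, traj_est = sync.associate_trajectories(traj_ref, traj_est)
traj_est.align(traj_ref, correct_scale=True)  # Sim(3) / Umeyama alignment

ape = metrics.APE(metrics.PoseRelation.translation_part)
ape.process_data((traj_ref, traj_est))
print("ATE RMSE:", ape.get_statistic(metrics.StatisticsType.rmse))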

gyes00205 commented 1 year ago

Hi @jugwangjin ,

Thanks for sharing. I have a question about how to save the point cloud, because demo.py only saves tstamps, images, disps, poses, and intrinsics: https://github.com/princeton-vl/DROID-SLAM/blob/8016d2b9b72b101a3e9ac804ebf20b4c654dc291/demo.py#L72-L77 I tried to save the point cloud by adding the code below.

if args.reconstruction_path is not None:
    save_dir = os.path.join("reconstructions", args.reconstruction_path)
    os.makedirs(save_dir, exist_ok=True)
    vis.capture_screen_image(os.path.join(save_dir, "scene.png"))
    vis.capture_depth_point_cloud(os.path.join(save_dir, "scene.pcd"))

But my point cloud doesn't have color, and the camera frustum also ends up in the captured points. [screenshot]

Thanks in advance.
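
If it helps, a colored point cloud can also be built directly from the arrays demo.py saves, skipping the visualizer capture. This is only a sketch, not the repo's code; it assumes disps.npy is inverse depth at 1/8 of the image resolution, images.npy is (N, 3, H, W) RGB, intrinsics.npy is (fx, fy, cx, cy) at the same 1/8 scale, and poses.npy holds world-to-camera 7-vectors [tx, ty, tz, qx, qy, qz, qw]:

import numpy as np
import open3d as o3d
from scipy.spatial.transform import Rotation

images = np.load("images.npy")          # (N, 3, H, W) RGB, uint8
disps = np.load("disps.npy")            # (N, h, w) inverse depth, h = H / 8
intrinsics = np.load("intrinsics.npy")  # (N, 4): fx, fy, cx, cy at 1/8 scale
poses = np.load("poses.npy")            # (N, 7) world-to-camera

pts, cols = [], []
for i in range(len(disps)):
    h, w = disps[i].shape
    fx, fy, cx, cy = intrinsics[i]
    v, u = np.mgrid[:h, :w]
    z = 1.0 / np.clip(disps[i], 1e-6, None)  # depth from inverse depth
    cam = np.stack([(u - cx) / fx * z, (v - cy) / fy * z, z], -1).reshape(-1, 3)
    t, q = poses[i, :3], poses[i, 3:]        # q = (qx, qy, qz, qw)
    R = Rotation.from_quat(q).as_matrix()
    pts.append((cam - t) @ R)                # apply the inverse of world-to-camera
    rgb = images[i][:, : 8 * h : 8, : 8 * w : 8]  # subsample to disp resolution
    cols.append(rgb.transpose(1, 2, 0).reshape(-1, 3) / 255.0)

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(np.concatenate(pts))
pcd.colors = o3d.utility.Vector3dVector(np.concatenate(cols))
o3d.io.write_point_cloud("scene_colored.ply", pcd)

Low-disparity points map to very large depths, so in practice you may want to mask them out before concatenating.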

jugwangjin commented 1 year ago

Hi @gyes00205,

I think you could use open3d.geometry.LineSet.create_camera_visualization. Since you have the camera extrinsics, it should be straightforward to build a list of open3d.geometry.LineSet frustums and draw them together with your point cloud.
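
A rough sketch of what I mean, assuming Ks, width, and height from my earlier snippet, plus a hypothetical extrinsics array of 4x4 world-to-camera matrices:

import numpy as np
import open3d as o3d

geoms = [pcd]  # the reconstructed point cloud
for K, T_w2c in zip(Ks, extrinsics):
    frustum = o3d.geometry.LineSet.create_camera_visualization(
        int(width), int(height), K, T_w2c, scale=0.1
    )
    geoms.append(frustum)
o3d.visualization.draw_geometries(geoms)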

gyes00205 commented 1 year ago

Hi @jugwangjin

Thanks for your reply, I fixed the problem. [screenshot]

Rtut654 commented 11 months ago

@gyes00205 @jugwangjin Hey! I know this is a bit off-topic, but have you trained the model without depth, using only raw images and poses? This issue mentions some trouble doing so. I wonder how to implement it, if it's possible. Thanks in advance.

gyes00205 commented 11 months ago

Hi @Rtut654, I haven't trained the model; I only use the pre-trained weights to obtain camera poses in my work.