princeton-vl / DPVO

Deep Patch Visual Odometry/SLAM

About camera pose for Replica #44

Closed · pengchongH closed this issue 8 months ago

pengchongH commented 8 months ago

Thanks for your great work!

I tried running your trained model on a Replica scene, the dataset used in iMAP and NICE-SLAM. I noticed that the estimated camera poses are not at real-world (metric) scale. I also visualized the sparse point cloud, which is smaller than the ground-truth mesh. I would like to know whether I made any mistakes in using your code.

Thanks for your help!

sparse point cloud

[image]

ground truth mesh

[image]
lahavlipson commented 8 months ago

iMAP and NICE-SLAM use RGB-D input, so they obtain the scene scale from the depth input. DPVO only uses RGB, so the scale is unknown. If you have a depth map from Replica, you can compute a rescaling factor by aligning one of the DPVO depth maps to a Replica depth map.
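
To illustrate the scale ambiguity, here is a minimal sketch (not part of the DPVO codebase; `poses` and `points` are hypothetical names for whatever you export) of applying a known scale factor `s` to a monocular reconstruction. Rotations are scale-invariant, so only the pose translations and the point cloud are rescaled:

```python
import numpy as np

def apply_scale(poses, points, s):
    """poses: (N, 4, 4) camera-to-world matrices; points: (M, 3) point cloud."""
    poses = poses.copy()
    poses[:, :3, 3] *= s      # rescale only the translation component
    return poses, points * s  # rescale the sparse point cloud
```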

pengchongH commented 8 months ago

Thanks for your reply; that makes sense. But I'm not sure how to obtain a DPVO depth for a specific frame, since I noticed the pipeline only uses sparse patches, which carry the inverse-depth information.

lahavlipson commented 8 months ago

The center of each 3x3 patch contains an inverse depth and a pixel location (at 1/4 resolution). If you sample the associated Replica depth map at this pixel location and multiply by the DPVO inverse depth, you will get the scale factor. The predicted patch depths may be somewhat noisy, so it is best to do this for many patches and take the median scale factor.
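
A hedged sketch of this recipe, assuming the patch centers have already been collected as `(x, y, inverse_depth)` triplets and the matching Replica depth map is loaded as a NumPy array (the names here are illustrative, not DPVO API):

```python
import numpy as np

def estimate_scale(patch_centers, replica_depth):
    """patch_centers: (N, 3) rows of (x, y, inverse_depth), where (x, y)
    is the center pixel of a 3x3 DPVO patch at 1/4 input resolution.
    replica_depth: (H, W) ground-truth depth map in meters."""
    H, W = replica_depth.shape
    scales = []
    for x, y, inv_depth in patch_centers:
        # Patch coordinates are at 1/4 resolution; scale up by 4 to
        # index the full-resolution ground-truth depth map.
        u, v = int(round(4 * x)), int(round(4 * y))
        if not (0 <= u < W and 0 <= v < H) or inv_depth <= 0:
            continue
        gt = replica_depth[v, u]
        if gt <= 0:
            continue  # missing/invalid ground-truth depth
        # scale = gt_depth / dpvo_depth = gt_depth * dpvo_inverse_depth
        scales.append(gt * inv_depth)
    # Predicted patch depths are noisy, so take the median over many patches.
    return float(np.median(scales))
```

Multiplying the estimated camera translations and the sparse point cloud by this factor (as in the earlier sketch) should bring the DPVO trajectory to Replica's metric scale.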

pengchongH commented 8 months ago

I will try this. Thanks for your detailed illustration!

Dong09 commented 1 month ago

> I will try this. Thanks for your detailed illustration!

Hello, have you found a way to calculate the depth map? I would be very grateful.