naver / dust3r

DUSt3R: Geometric 3D Vision Made Easy
https://dust3r.europe.naverlabs.com/

What if a partial set of poses is known? #54

Open dscho15 opened 3 months ago

dscho15 commented 3 months ago

Hey Naver,

First of all great work, it is very interesting to play around with!

I'm curious: if one knows a partial set of poses and focal lengths beforehand, how should one initialize the pose graph?

Best regards

hturki commented 3 months ago

Looking at the code, there are preset_pose and preset_focal functions that might do what you want?
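
For illustration, here is a rough sketch of how those presets could be called on the standard optimizer; `known_focals` and `known_poses` are assumed names holding one focal value and one 4x4 camera-to-world matrix per image, and the exact signatures may differ between versions:

    # rough sketch, assuming `output` is the usual dust3r inference result and that
    # `known_focals` / `known_poses` (hypothetical names) hold one entry per image
    from dust3r.cloud_opt import global_aligner, GlobalAlignerMode

    scene = global_aligner(output, device=device, mode=GlobalAlignerMode.PointCloudOptimizer)
    scene.preset_focal(known_focals)  # fix focal lengths before optimization
    scene.preset_pose(known_poses)    # fix camera-to-world poses
    loss = scene.compute_global_alignment(init="mst", niter=300, schedule="cosine", lr=0.01)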

dscho15 commented 3 months ago

I did overwrite the estimated focal lengths, but since I'm only given a partial set of poses, I would like to include them in the optimization problem as landmarks. Thanks for the heads-up, though.

jerome-revaud commented 3 months ago

Internally, we have a slower version of the global alignment where you can force a partial initialisation of camera intrinsics and extrinsics (e.g. focals and poses). If you're interested, we could release this code. @yocabon

dscho15 commented 3 months ago

That would be a very nice feature 😁

mizeller commented 3 months ago

I'd also be interested in this code; specifically to be able to provide known camera intrinsics to the pipeline. I assume this additional prior should simplify the optimization procedure as well?

Thanks for the great work! Very interesting :-)

yocabon commented 3 months ago

I'll take a look.

yocabon commented 3 months ago

I added it in https://github.com/naver/dust3r/commit/4a414b6406e5b3da3278a97f8cef5acfa2959d0b. Example usage:

    # here `data` is a list of (intrinsics, pose) tuples, one per image
    scene = global_aligner(output, device=device, mode=GlobalAlignerMode.ModularPointCloudOptimizer, fx_and_fy=True)

    # the boolean mask has one entry per image: images flagged True receive the
    # provided values (in order), images flagged False stay free during optimization
    scene.preset_pose([data[i][1].cpu().numpy() for i in range(1, 3)], [False, True, True])
    scene.preset_intrinsics([data[i][0].cpu().numpy() for i in range(1, 3)], [False, True, True])
    loss = scene.compute_global_alignment(init="mst", niter=niter, schedule=schedule, lr=lr)

Note: it won't work if you have only one known pose.
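
If only a single pose is known, one workaround consistent with this note is to run the alignment without pose presets and then rigidly map the result onto the known pose afterwards. A minimal sketch, assuming `known_pose` is a 4x4 camera-to-world matrix for image index `k` (a single pose cannot resolve the global scale):

    # minimal sketch, not part of the released API: re-anchor the optimized scene
    # on a single known camera-to-world pose `known_pose` for image index k
    import numpy as np

    est_poses = scene.get_im_poses().detach().cpu().numpy()  # (N, 4, 4) cam-to-world
    T = known_pose @ np.linalg.inv(est_poses[k])              # estimated frame -> known frame
    aligned_poses = T[None] @ est_poses                       # same rigid map applied to every camera
    # the reconstruction's global scale stays ambiguous; fixing it needs a second pose or a known distance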

drdsgvo commented 2 months ago

> I added it in 4a414b6. Example usage: [...]
>
> Note: it won't work if you have only one known pose.

After trying this I got the following error:

    loss = scene.compute_global_alignment(init="mst", niter=niter, schedule=schedule, lr=lr)
      File "/home/km/.local/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
        return func(*args, **kwargs)
      File "/home/km/python/check/dust3r/dust3r/cloud_opt/base_opt.py", line 304, in compute_global_alignment
        init_fun.init_minimum_spanning_tree(self, niter_PnP=niter_PnP)
      File "/home/km/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "/home/km/python/check/dust3r/dust3r/cloud_opt/init_im_poses.py", line 77, in init_minimum_spanning_tree
        return init_from_pts3d(self, pts3d, im_focals, im_poses)
      File "/home/km/python/check/dust3r/dust3r/cloud_opt/init_im_poses.py", line 84, in init_from_pts3d
        raise NotImplementedError("Would be simpler to just align everything afterwards on the single known pose")
    NotImplementedError: Would be simpler to just align everything afterwards on the single known pose

I used your code, put it into the demo code in the appropriate place, and used the following data for preset_pose and preset_intrinsics for 2 images:

    pose = torch.Tensor([
        [[1, 0.0, 0, 0], [0.0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]],
        [[0.85, 0.25, -0.45, 0], [-0.1, 1, 0.4, 0], [0.5, -0.3, 1, 0], [0, 0, 0, 0]]
    ])
    intrinsics = torch.Tensor([
        [[685, 0, 256], [0, 685, 192], [0, 0, 1]],
        [[685, 0, 256], [0, 685, 192], [0, 0, 1]]
    ])

Am I doing something wrong?

I have no clue how to set the pose + intrinsics data for known camera positions (sorry if this is a basic question; I'm not an expert in 3D and would appreciate some advice). What is the meaning of the values in the pose and intrinsics tensors?
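
For reference, a minimal sketch of the conventional layout, assuming DUSt3R's usual camera-to-world pose convention: the pose is a 4x4 homogeneous matrix whose top-left 3x3 block is the camera rotation, whose last column is the camera position in world coordinates, and whose bottom row is [0, 0, 0, 1]; the intrinsics matrix holds the focal lengths and the principal point (the 685 / 256 / 192 values above):

    import torch

    # illustrative pose: identity rotation plus an arbitrary example translation,
    # laid out as [R | t] over a homogeneous bottom row of [0, 0, 0, 1]
    pose = torch.eye(4)
    pose[:3, 3] = torch.tensor([0.1, 0.0, 0.5])  # camera center in world coordinates

    # intrinsics: fx, fy = focal lengths in pixels, (cx, cy) = principal point
    fx = fy = 685.0
    cx, cy = 256.0, 192.0
    intrinsics = torch.tensor([[fx,  0.0, cx],
                               [0.0, fy,  cy],
                               [0.0, 0.0, 1.0]])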