Can the camera poses reconstructed by colmap be used as input for training?

nianticlabs / ace

[CVPR 2023 - Highlight] Accelerated Coordinate Encoding (ACE): Learning to Relocalize in Minutes using RGB and Poses

https://nianticlabs.github.io/ace


Can the camera poses reconstructed by colmap be used as input for training? #36

Closed. GottenZZP closed this issue 2 weeks ago.

GottenZZP commented 2 weeks ago

Thank you very much for open-sourcing this excellent model. I would like to know whether image poses recovered with COLMAP can be used with your method. In other words, can I capture a photo dataset of the scene with a mobile phone, run a COLMAP 3D reconstruction, and then convert the resulting extrinsic parameters (poses) and intrinsic parameters (focal lengths) into your dataset format for training? Currently, when I train with the camera poses recovered by COLMAP, the results are not very good: rotation errors are around 1 degree and position errors are tens or even hundreds of centimeters.
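The error figures quoted above are usually computed as the angle of the relative rotation and the distance between camera centres. The following is a minimal sketch of that computation, not code from the ACE repository; the function names are hypothetical, and it assumes both poses are camera-to-world and that positions are in metres. A COLMAP reconstruction is only defined up to scale unless it has been scaled to metric units, so centimetre values are only meaningful after such scaling.

```python
import math

def rotation_error_deg(R_est, R_gt):
    """Angle (degrees) of the relative rotation between two 3x3 rotation matrices."""
    # trace(R_est^T @ R_gt) equals the sum of element-wise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against rounding before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error_cm(c_est, c_gt):
    """Distance between two camera centres given in metres, returned in centimetres."""
    return 100.0 * math.dist(c_est, c_gt)
```

For example, `rotation_error_deg` applied to the identity and a 90 degree rotation about the z-axis yields 90, and two camera centres 0.5 m apart yield 50 cm.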

ebrach commented 2 weeks ago

Hi! Yes, that is possible; for example, this is how our Wayspots dataset was created. It seems to work in principle, otherwise the errors would be much worse. There are various possible reasons for the inaccuracy you observe:

- The dataset is difficult for ACE, e.g. due to a large spatial extent or repeating structures.
- There is a mismatch in calibration parameters between COLMAP and ACE, e.g. if you reconstructed with radial distortion in COLMAP, while ACE does not support radial distortion.
- If the scene is far away from the camera plane, there is high uncertainty in the position estimates (both for COLMAP and ACE).

In general, be mindful that measuring errors with respect to COLMAP pseudo ground truth comes with its own pitfalls: https://arxiv.org/abs/2109.00524

That is what comes to my mind :) Hope some of it is helpful. Eric
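A frequent source of error in the conversion discussed in this thread is the pose direction. COLMAP's `images.txt` stores each pose as a world-to-camera quaternion (`QW QX QY QZ`) and translation (`TX TY TZ`). The sketch below inverts that into a 4x4 camera-to-world matrix. It assumes ACE expects camera-to-world poses in the same OpenCV-style axis convention that COLMAP uses, which should be checked against the ACE dataset documentation; the function names are hypothetical and not part of either code base.

```python
import math

def quat_to_rotmat(qw, qx, qy, qz):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    n = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    qw, qx, qy, qz = qw / n, qx / n, qy / n, qz / n  # normalise defensively
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]

def colmap_to_cam2world(qw, qx, qy, qz, tx, ty, tz):
    """Invert a COLMAP world-to-camera pose into a 4x4 camera-to-world matrix."""
    R = quat_to_rotmat(qw, qx, qy, qz)  # world-to-camera rotation
    t = (tx, ty, tz)                    # world-to-camera translation
    # Inverse of [R | t] is [R^T | -R^T t].
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    c = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [c[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]
```

The last column of the result is the camera centre in world coordinates. If the COLMAP camera model includes distortion parameters, the images would additionally need to be undistorted (or the scene reconstructed with a pinhole model) before the focal length is reused, in line with the calibration point above.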

GottenZZP commented 2 weeks ago

Thank you for your prompt reply. I tried visualizing ACE training by adding the "--render_visualization" parameter, but found that with the poses obtained from the COLMAP 3D reconstruction, ACE did not model the point cloud well (or did not model it at all). I wonder whether the large amount of reflection in my scene is causing this: it is an indoor environment with a highly reflective floor and floor-to-ceiling windows. I noticed that your team released a new paper this year, AceZero, which appears to focus on SfM. Could you please confirm whether the pose_final.txt file produced by the new AceZero method can be used with the current ACE method? Looking forward to your response.

ebrach commented 6 days ago

Hi! As I also replied in the other repository, the ACE0 code contains the full ACE relocaliser, so you can reconstruct a scene using ACE0 and get an ACE relocaliser automatically (or train one with the pose file if that is needed).

A highly reflective environment could very well cause problems for ACE. Give ACE0 a try, enable the visualisation there as well, and see whether it reconstructs something plausible. If not, the scene might just be too difficult for ACE/ACE0.

Best, Eric