zju3dv / DetectorFreeSfM

Code for "Detector-Free Structure from Motion", CVPR 2024
Apache License 2.0

Metrics on translation and visualization #48

Closed ComputerMath closed 2 weeks ago

ComputerMath commented 2 weeks ago

Hello, thanks for your great work!

I am curious about the metrics and visualizations you put in the paper.

[image]

First, how did you measure the error in the translation vectors? In triangulation, the ground-truth poses (both translation and rotation) are given to build the SfM model, so the error would be zero in that case. In reconstruction, on the other hand, the scale of the translation and the orientation of the world frame are unknown. So how did you compute the translation error of the camera poses shown in this figure?

[image]

I also don't understand how the red and blue camera poses in the figure above are determined and compared. As explained above, the camera poses come from ground truth in triangulation, while in reconstruction the scale of the translation and the orientation of the world frame are unknown.

Can you explain in detail how you determined the camera poses when drawing these figures?

thanks in advance :)

hxy-123 commented 2 weeks ago

Hi. First, let me clarify that all of these visualizations show poses recovered by our SfM system, not by triangulation (which uses known poses to recover 3D points). Since the recovered poses are in a different world coordinate frame than the ground truth, we visualize them by first performing model alignment (using the COLMAP API: colmap model_aligner). After alignment, both the ground-truth and predicted poses are in the same world coordinate frame, so we can directly compute the rotation error and translation error of each pose, as shown in our main pipeline figure (refined model).
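For reference, once the predicted model has been aligned into the ground-truth world frame, the per-camera errors can be computed as below. This is a minimal NumPy sketch under my own naming, not code from this repository: rotation error as the geodesic angle between the two rotation matrices, translation error as the Euclidean distance between the (aligned) translation vectors.

```python
import numpy as np

def rotation_error_deg(R_gt: np.ndarray, R_pred: np.ndarray) -> float:
    """Geodesic angle (degrees) between two 3x3 rotation matrices."""
    # trace(R_gt^T @ R_pred) = 1 + 2*cos(theta) for the relative rotation.
    cos_theta = (np.trace(R_gt.T @ R_pred) - 1.0) / 2.0
    cos_theta = np.clip(cos_theta, -1.0, 1.0)  # guard against numeric drift
    return float(np.degrees(np.arccos(cos_theta)))

def translation_error(t_gt: np.ndarray, t_pred: np.ndarray) -> float:
    """Euclidean distance between two translation vectors in the same frame."""
    return float(np.linalg.norm(np.asarray(t_gt) - np.asarray(t_pred)))
```

Both functions assume the two models are already in the same world coordinate frame (i.e. after model alignment); without alignment the translation error is meaningless because of the scale and gauge ambiguity.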

ComputerMath commented 2 weeks ago

Thanks for your quick reply! I totally understand now :)