facebookresearch / vggsfm

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Reprojection error #70

Open joseTWD opened 2 weeks ago

joseTWD commented 2 weeks ago

Hi,

I've been playing around a bit with my own data, and the results are cool despite my limited memory and resources! Congrats on the amazing work! I was wondering how I could retrieve the reprojection error to estimate the quality of the scene reconstruction. Could you please provide any advice or information about it? :)

jytime commented 2 weeks ago

Hey, thanks for your kind words. If you still have access to the logs from the run, you can just capture the log output, which should contain something like:

Final cost : 0.525241 [px]
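If the log was saved to a file, that value can be pulled out with a short script. The following is a minimal sketch, assuming the log contains lines in exactly the format shown above. The function name extract_final_costs and the sample log text are made up for illustration and are not part of VGGSfM.

```python
import re

# Matches a line such as "Final cost : 0.525241 [px]" and captures the number.
FINAL_COST_RE = re.compile(
    r"Final cost\s*:\s*([0-9]*\.?[0-9]+(?:[eE][+-]?[0-9]+)?)\s*\[px\]"
)


def extract_final_costs(log_text):
    """Return every 'Final cost' value (in pixels) found in log_text, in order."""
    return [float(m.group(1)) for m in FINAL_COST_RE.finditer(log_text)]


# Hypothetical log excerpt; only the "Final cost" line format comes from the thread.
sample_log = """
some earlier output
Final cost : 0.525241 [px]
"""

costs = extract_final_costs(sample_log)
if costs:
    # If bundle adjustment ran several times, the last entry is the most recent one.
    print(f"last reported final cost: {costs[-1]} px")
else:
    print("no 'Final cost' line found")
```

Returning a list rather than a single value keeps the sketch usable when the log contains more than one bundle adjustment report.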

If you no longer have access to the logs, the simplest way should be:

mean_reproj_error = reconstruction.compute_mean_reprojection_error()

but please note that this will only work when your 2D points are in the correct resolution, i.e., when you run with shift_point2d_to_original_res=True.

By the way, I personally think:

mean_track_len = reconstruction.compute_mean_track_length()

is a better and more reliable metric. And for this, you don't have to set shift_point2d_to_original_res=True.

joseTWD commented 2 weeks ago

It worked! Thank you very much for your quick answer. I spent several hours searching the codebase for any function that could give me a metric besides the logs. Also, your point about the track length is interesting. Why can we consider it more reliable than the reprojection error?

Based on the definition of track length (I had to look it up to be sure, haha: https://stackoverflow.com/questions/49421048/what-track-and-track-length-means-in-sfm), I don't really get why this length is a reliable metric for estimating the quality of the reconstruction. Does a larger value mean a better reconstruction in all cases? What about track lengths that are necessarily smaller because of the video/image poses we use as input?

Thanks again for your response! :)

jytime commented 2 weeks ago

Hi,

I should clarify my earlier point. Usually, a large mean_track_len / total_frame_num means a better reconstruction, because it suggests that the model is uncertain about fewer images. In standard bundle adjustment (BA) pipelines, which I believe are very common in open-source SfM codebases, points exceeding certain error thresholds are removed or masked in subsequent runs of BA. As a result, the final mean reprojection error only accounts for a subset of the points (sometimes a small one, sometimes a large one). This can make the metric somewhat unreliable as an indicator of accuracy.
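To make both points concrete, here is a small sketch with made-up numbers. It assumes per-observation reprojection errors and per-point track lengths are available as plain Python lists. The helper names and the threshold are illustrative only and do not reflect how VGGSfM or any particular BA implementation filters points.

```python
def mean_error_after_filtering(errors_px, max_error_px):
    """Drop observations above the threshold, then average what is left.

    Returns (mean error of the kept observations, fraction of observations kept).
    """
    kept = [e for e in errors_px if e <= max_error_px]
    if not kept:
        return float("nan"), 0.0
    return sum(kept) / len(kept), len(kept) / len(errors_px)


def track_length_ratio(track_lengths, total_frame_num):
    """Mean track length divided by the number of frames in the scene."""
    mean_track_len = sum(track_lengths) / len(track_lengths)
    return mean_track_len / total_frame_num


# Made-up per-observation reprojection errors in pixels.
errors_px = [0.3, 0.4, 0.5, 6.0, 9.0]

# With a very loose threshold nothing is removed.
print(mean_error_after_filtering(errors_px, max_error_px=100.0))

# With a strict threshold the outliers vanish and the mean looks much better,
# although it now describes only part of the observations.
print(mean_error_after_filtering(errors_px, max_error_px=1.0))

# Made-up track lengths (number of frames observing each 3D point), 10 frames total.
print(track_length_ratio([2, 4, 6, 8], total_frame_num=10))
```

Reporting the kept fraction next to the mean error shows how much of the data the error figure actually covers.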

Additionally, if you have access to ground truth annotations for the scene, I recommend calculating the camera pose difference between the predicted and actual values. Let me know if you need assistance locating the code to perform this comparison.
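As a rough illustration of such a comparison, the sketch below computes the angular difference between a predicted and a ground-truth rotation, and between the two translation directions. This is not the repository's own evaluation code. It assumes both poses are already expressed in a common frame (for example as relative poses between the same pair of cameras), since an SfM reconstruction is only defined up to a similarity transform. The function names are hypothetical.

```python
import math


def rotation_angle_deg(R_pred, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices (nested lists)."""
    # trace(R_pred^T @ R_gt) equals the sum of element-wise products.
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_angle_deg(t_pred, t_gt):
    """Angle in degrees between two translation directions; scale is ignored."""
    dot = sum(a * b for a, b in zip(t_pred, t_gt))
    norm = math.sqrt(sum(a * a for a in t_pred)) * math.sqrt(sum(b * b for b in t_gt))
    if norm == 0.0:
        return float("nan")  # direction is undefined for a zero translation
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


# Made-up example: ground truth is the identity, prediction is rotated 90 degrees about z.
R_gt = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
R_pred = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(rotation_angle_deg(R_pred, R_gt))

# Translations that differ only in scale give a zero angular error.
print(translation_angle_deg([1.0, 0.0, 0.0], [3.0, 0.0, 0.0]))
```

The translation error is measured as an angle rather than a distance because the absolute scale of the reconstruction is generally not recoverable from images alone.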