facebookresearch / vggsfm

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Does the Triangulator have learnable parameters during training? #47

Open OasisYang opened 1 month ago

OasisYang commented 1 month ago

Hi, thanks for releasing this great work! I've been examining your excellent work and have a question about the Triangulator.

I noticed it doesn't have any learnable parameters. Is this designed specifically for inference (given that the current inference process uses COLMAP bundle adjustment rather than a differentiable version)? If I were to fine-tune VGGSfM on custom data, would I need to implement and train a Triangulator module with learnable parameters, as described in your paper?

I appreciate your time and look forward to your response. Thanks!

jytime commented 1 month ago

Hi @OasisYang ,

Yes, the triangulator had learnable parameters when we trained our model, as detailed in our paper. However, in the released version, we switched to a formula-based triangulator. We made this change because the deep (learned) triangulator behaved unpredictably on out-of-domain data, likely due to insufficient training on large-scale data. To simplify inference for all users, we temporarily removed the deep triangulator.
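
For readers wondering what a parameter-free, formula-based triangulator can look like, here is a minimal sketch using linear (DLT) triangulation with NumPy. It illustrates the general technique only and is not necessarily what the VGGSfM release implements; the function name `triangulate_dlt`, the helper `project`, and the toy cameras are all assumptions made for this example.

```python
import numpy as np


def triangulate_dlt(projections, points_2d):
    """Triangulate one 3D point from N >= 2 views with the linear DLT method.

    projections: (N, 3, 4) camera projection matrices (hypothetical inputs).
    points_2d:   (N, 2) pixel observations of the same 3D point.
    Returns the 3D point as a (3,) array. Nothing here is learned.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]
    return X_h[:3] / X_h[3]


def project(P, X):
    """Project a 3D point X with projection matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


# Toy setup (assumed values): two cameras sharing intrinsics, shifted along x.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 4.0])
observations = np.array([project(P1, X_true), project(P2, X_true)])
X_est = triangulate_dlt(np.stack([P1, P2]), observations)
```

With noise-free observations, `X_est` should match `X_true` up to floating-point error. The function has no weights to train, which fits the observation in the question that the released Triangulator has no learnable parameters.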

We plan to reintroduce the deep triangulator in VGGSfM version 2.1. For now, if you want to fine-tune our model, the simplest approach is to fine-tune only the camera predictor and tracker. This should yield reasonable performance thanks to the good initialization from our pretrained weights.
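
A rough PyTorch sketch of that fine-tuning setup follows. The `ToySfM` class is a hypothetical stand-in rather than the real VGGSfM model, and the attribute names `camera_predictor`, `track_predictor`, and `triangulator` are assumptions that should be checked against the actual code. The learning rate is also just a placeholder.

```python
import torch
from torch import nn


class ToySfM(nn.Module):
    """Hypothetical stand-in for a pretrained SfM model (not the real VGGSfM class)."""

    def __init__(self):
        super().__init__()
        self.camera_predictor = nn.Linear(8, 8)  # placeholder for the camera predictor
        self.track_predictor = nn.Linear(8, 8)   # placeholder for the tracker
        self.triangulator = nn.Identity()        # formula-based: no parameters


model = ToySfM()  # in practice, load the pretrained weights here

# Train only parameters under these (assumed) submodule names; freeze the rest.
trainable_prefixes = ("camera_predictor.", "track_predictor.")
for name, param in model.named_parameters():
    param.requires_grad = name.startswith(trainable_prefixes)

trainable_params = [p for p in model.parameters() if p.requires_grad]

# Assumed small learning rate, since the weights start from a pretrained state.
optimizer = torch.optim.AdamW(trainable_params, lr=1e-5)
```

Because the formula-based triangulator contributes no parameters, it needs no special handling in the optimizer. Whether gradients can flow through it during fine-tuning depends on how it is implemented.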

Best, Jianyuan