facebookresearch / vggsfm

[CVPR 2024 Highlight] VGGSfM Visual Geometry Grounded Deep Structure From Motion

About training code and differentiable BA implementation #20

Open mkocabas opened 2 weeks ago

mkocabas commented 2 weeks ago

Hi @jytime,

Awesome work! Do you have any plans to release the training code?

I am specifically interested in how you used the Theseus library to implement differentiable BA.

jytime commented 2 weeks ago

Hi @mkocabas,

We do plan to release the training code, but it is not our top priority at the moment. Our recent focus is to resolve the memory issue and provide a version that supports video input (as mentioned in https://github.com/facebookresearch/vggsfm/issues/9). We hope they will facilitate easier off-the-shelf usage.

Regarding Theseus, their team provides an official example for differentiable Bundle Adjustment, available here: https://github.com/facebookresearch/theseus/blob/e07569138cafa2d5da3e16ff2586d2495e77817c/examples/bundle_adjustment.py#L4

While this implementation isn’t fast, it functions correctly.

Please let me know if there is anything else I can help with.

mkocabas commented 2 weeks ago

Thanks a lot for your prompt response!

I am looking forward to your training code.

I am aware of the example implementation, but as you mentioned, it is quite slow. I wonder which tricks you used to make it run faster. It would be immensely helpful if you could share a code snippet implementing those changes.

jytime commented 2 weeks ago

Hi, as far as I can remember now (these experiments were conducted months ago, sorry), these points helped:

(a) Use the linear solver BaSpaCho. BaSpaCho was specially designed by the Theseus team for BA. You need to compile it from source.

(b) A very time-consuming step is using a for loop to convert PyTorch tensors to Theseus variables. It will be much faster if you can make it a batch operation. A trial I used before was https://github.com/facebookresearch/theseus/issues/565; please check the GitHub issue discussion for details. (Please note, however, that we did not end up using this implementation. Instead, we modified the Theseus source code to achieve something similar.)

(c) During training, use a small number of 3D points and camera parameters. Subsampling 25% of the parameters for BA does not significantly affect performance but greatly improves speed.
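For readers who want a concrete picture of point (c), here is a minimal sketch of subsampling 3D points before BA. It is not the authors' code. The function name `subsample_points_for_ba`, the observation layout (`cam_idx`, `point_idx`, `uv`), and reading "25%" as the fraction that is kept are all assumptions made for illustration. It uses NumPy to stay self-contained and covers only the points, not the cameras; the same index arithmetic should carry over to PyTorch tensors.

```python
import numpy as np


def subsample_points_for_ba(points3d, cam_idx, point_idx, uv,
                            keep_fraction=0.25, seed=None):
    """Keep a random fraction of 3D points and only the observations of them.

    Hypothetical layout (an assumption, not taken from VGGSfM or Theseus):
      points3d  : (P, 3) array of 3D points
      cam_idx   : (N,) camera index of each 2D observation
      point_idx : (N,) 3D point index of each 2D observation
      uv        : (N, 2) observed pixel coordinates
    """
    rng = np.random.default_rng(seed)
    num_points = points3d.shape[0]
    num_keep = max(1, int(round(num_points * keep_fraction)))

    # Sample the surviving point ids without replacement.
    keep = np.sort(rng.choice(num_points, size=num_keep, replace=False))

    # Map old point ids to new compact ids; dropped points map to -1.
    remap = np.full(num_points, -1, dtype=np.int64)
    remap[keep] = np.arange(num_keep)

    # Keep only the observations that refer to a surviving point.
    obs_mask = remap[point_idx] >= 0

    return (points3d[keep],
            cam_idx[obs_mask],
            remap[point_idx[obs_mask]],
            uv[obs_mask],
            keep)
```

The returned `keep` array records which original points survived, so optimized values can be written back to the full point set after BA. Drawing a fresh subset at every training step (by varying `seed`) would let all points take part over time, though the thread does not say whether the authors did this.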