dcharatan / flowmap

Code for "FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent" by Cameron Smith*, David Charatan*, Ayush Tewari, and Vincent Sitzmann
https://cameronosmith.github.io/flowmap/
MIT License

Depth as only free variable #43

Closed Davidyao99 closed 3 months ago

Davidyao99 commented 3 months ago

Thank you for the insightful work.

I found it interesting that expressing pose as a function of depth (instead of a separate optimizable variable) leads to significantly better performance. This is supported in your ablation study.

I was wondering if you have some intuition or reasoning for why this leads to better results?

dcharatan commented 3 months ago

If the poses and depths are optimized as separate free variables, they can become "out of sync" in the sense that their scales don't agree (in other words, the translation component of the pose can be much larger or smaller than would be expected for the depth). When this happens, the optimization can become unstable. If you're curious, I would encourage you to run the ablation yourself and see the results logged to wandb/disk -- you'll see how the optimization often behaves a lot worse.
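
To make the "out of sync" point concrete, here is a minimal NumPy sketch. It is not FlowMap code; it assumes a pinhole camera, a pure translation with identity rotation, and hypothetical helper names (`project`, `induced_flow`). Scaling depth and translation together leaves the induced flow unchanged, while scaling only the depth changes it.

```python
import numpy as np

def project(points, f=1.0):
    """Pinhole projection of Nx3 camera-space points with focal length f."""
    return f * points[:, :2] / points[:, 2:3]

def induced_flow(pixels, depth, translation, f=1.0):
    """Flow induced by a pure camera translation (identity rotation assumed)."""
    # Unproject each pixel to a 3D point using its depth.
    rays = np.concatenate([pixels / f, np.ones((len(pixels), 1))], axis=1)
    points = rays * depth[:, None]
    # Express the points in the second camera's frame and reproject.
    moved = points + translation
    return project(moved, f) - pixels

rng = np.random.default_rng(0)
pixels = rng.uniform(-0.5, 0.5, size=(100, 2))
depth = rng.uniform(2.0, 5.0, size=100)
translation = np.array([0.1, 0.0, 0.0])

reference = induced_flow(pixels, depth, translation)
# Depth and translation scaled by the same factor: the scales agree.
consistent = induced_flow(pixels, 3.0 * depth, 3.0 * translation)
# Only depth scaled: translation is now too small for the depth.
mismatched = induced_flow(pixels, 3.0 * depth, translation)

consistent_error = np.abs(consistent - reference).max()
mismatched_error = np.abs(mismatched - reference).max()
print(consistent_error, mismatched_error)
```

With two free variables, nothing prevents the optimizer from drifting into the mismatched case, and the flow loss then has to pull depth and translation back into agreement.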

Davidyao99 commented 1 month ago

@dcharatan Sorry for opening this again, but could you elaborate on what you mean by "the scales don't agree"? I thought about this and was wondering whether the optimization becomes more stable with depth as the only optimizable variable because the closed-form solution for pose reduces the complexity of the objective function. With both pose and depth as optimizable variables, the objective is highly non-linear.
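
For reference, a closed-form pose solve of the kind discussed here can be sketched as a generic orthogonal Procrustes (Kabsch) alignment between two sets of corresponding 3D points. This is an illustrative assumption, not the repository's implementation: the function name `procrustes_pose` is hypothetical, the points are synthetic rather than unprojected from depth and flow, and the solve is unweighted.

```python
import numpy as np

def procrustes_pose(source, target):
    """Least-squares rigid transform (R, t) with target ≈ source @ R.T + t."""
    source_mean = source.mean(axis=0)
    target_mean = target.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    covariance = (source - source_mean).T @ (target - target_mean)
    u, _, vt = np.linalg.svd(covariance)
    # Correct the sign so the result is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = target_mean - rotation @ source_mean
    return rotation, translation

# Synthetic correspondences related by a known rigid motion.
angle = 0.3
true_rotation = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
true_translation = np.array([0.2, -0.1, 0.05])
rng = np.random.default_rng(0)
points_a = rng.uniform(-1.0, 1.0, size=(50, 3)) + np.array([0.0, 0.0, 4.0])
points_b = points_a @ true_rotation.T + true_translation

rotation, translation = procrustes_pose(points_a, points_b)
# Scaling both point sets (i.e. scaling the depth) scales the recovered translation too.
scaled_rotation, scaled_translation = procrustes_pose(2.0 * points_a, 2.0 * points_b)
```

In this sketch the translation is derived from the points themselves, so when the depth scale changes, the translation scale follows automatically instead of having to be re-learned as a separate variable.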