JiahuiLei / MoSca


Question about optimizing camera poses #3

Closed ghost closed 2 months ago

ghost commented 2 months ago

Hello,

Thanks for your impressive work. I'm interested in dynamic 3D reconstruction from a monocular video. While reading the paper, I was curious about the part on optimizing camera poses. Since the current Gaussian Splatting renderer does not support gradient propagation to camera poses, it seems that you are not optimizing the camera poses with the L1 loss between the rendered image and the GT, as shown in Eq. 14 and 15.

However, I'm a little confused by "Note that we will also optimize the camera poses Wt throughout the rendering phases as well with photometric losses to further adjust the camera". I saw this was also mentioned in another issue. Does this mean that the camera pose is also optimized using Eq. 13 as the loss? If so, have you implemented a renderer that can propagate gradients to the camera pose? That would be very cool.

JiahuiLei commented 2 months ago

Thanks for your question.
The camera focal length is not optimized during rendering, but the camera poses are optimized during rendering with the L1 loss on the rendered RGB images and depths.

The native GS renderer can be used to optimize camera poses, but it does not support focal-length gradients. The reason is that the renderer takes as input the 5-tuple of Gaussian parameters (mu, rot, scales, opacity, and sph) in whatever coordinate frame they are expressed in, and produces an image. So we can always manually transform these parameters (rotate and translate mu, rotate rot) outside the native renderer with PyTorch autograd, and keep the camera pose passed to the renderer fixed at identity. In this way, we can optimize the camera poses. Hope this answer is useful, thank you.
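Below is a minimal sketch of this trick, not the repository's actual code: the rasterizer's camera is kept at identity, and the Gaussian means and orientation quaternions are instead transformed into the camera frame with differentiable PyTorch ops, so the L1 loss on rendered RGB and depth back-propagates into a learnable world-to-camera pose. The `render_fn` interface, the quaternion-plus-translation pose parametrization, and the unweighted L1 terms are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def quat_to_rotmat(q: torch.Tensor) -> torch.Tensor:
    """Unit quaternion (w, x, y, z) -> 3x3 rotation matrix."""
    w, x, y, z = q
    return torch.stack([
        torch.stack([1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)]),
        torch.stack([2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)]),
        torch.stack([2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)]),
    ])


def quat_mul(q: torch.Tensor, r: torch.Tensor) -> torch.Tensor:
    """Hamilton product of quaternions stored as (..., 4) in (w, x, y, z) order."""
    w1, x1, y1, z1 = q.unbind(-1)
    w2, x2, y2, z2 = r.unbind(-1)
    return torch.stack((
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ), dim=-1)


def pose_step(render_fn, mu_world, rot_world, scales, opacity, sph,
              cam_quat, cam_trans, rgb_gt, depth_gt, optimizer):
    """One gradient step on a learnable world-to-camera pose (cam_quat, cam_trans).

    `render_fn` is a hypothetical stand-in for a GS rasterizer that takes the
    5-tuple of Gaussian parameters already expressed in the camera frame,
    renders with the camera fixed at identity, and returns (rgb, depth).
    """
    q = F.normalize(cam_quat, dim=0)        # keep the quaternion unit-norm
    R = quat_to_rotmat(q)                   # differentiable w.r.t. cam_quat

    # Transform the Gaussians into the camera frame *outside* the renderer,
    # so the pose receives gradients through ordinary PyTorch autograd.
    mu_cam = mu_world @ R.T + cam_trans     # rotate + translate means (N, 3)
    rot_cam = quat_mul(q, rot_world)        # rotate orientations (N, 4)
    # scales, opacity, and SH coefficients are left unchanged.

    rgb, depth = render_fn(mu_cam, rot_cam, scales, opacity, sph)

    # Photometric supervision as described above: L1 on rendered RGB and depth.
    loss = (rgb - rgb_gt).abs().mean() + (depth - depth_gt).abs().mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.detach()


# Typical setup (illustrative, one frame):
#   cam_quat  = torch.nn.Parameter(torch.tensor([1.0, 0.0, 0.0, 0.0]))
#   cam_trans = torch.nn.Parameter(torch.zeros(3))
#   optimizer = torch.optim.Adam([cam_quat, cam_trans], lr=1e-4)
```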

ghost commented 2 months ago

Thanks for the quick reply!