Question about camera poses?

phongnhhn92 commented 8 months ago

Congrats on your paper! I have two quick questions about camera poses, since your method does not rely on an SfM point cloud for initialisation.

1. How can we obtain the camera poses? I suppose you also run COLMAP on the training data to obtain the poses, and then use your proposed SLV random point cloud initialisation to train the 3DGS model. Is that the case?
2. Can we optimize the camera poses with your method if they are noisy? Or do you assume that they are fixed during training?

Thanks, Phong!

crepejung00 commented 8 months ago

Hi, thank you for your interest in our work!

Yes, you are correct! In this work, we start from SfM poses plus random point clouds. We found that the hard prerequisite of SfM-initialized point clouds for reasonable performance limits the application of 3DGS in various settings. So in this work, we first relaxed the need for initial point clouds, allowing 3DGS to be trained successfully with only accurate poses! We found this relaxation greatly helpful in settings where the pose is known in advance, such as graphics pipelines like Blender or text-to-3D settings. To also relax the need for accurate poses, we are currently working on noisy pose + RAIN-GS.
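To make the "known poses + random point cloud" setup concrete, here is a minimal NumPy sketch. It is not the RAIN-GS code: the function names, the bounding-box heuristic, and the defaults (`num_points`, `scale`) are assumptions chosen for illustration, and the poses are assumed to be world-to-camera, i.e. `x_cam = R @ x_world + t`.

```python
import numpy as np


def camera_centers(rotations, translations):
    """Camera centres in world space for world-to-camera poses.

    rotations:    (N, 3, 3) array of R, with x_cam = R @ x_world + t
    translations: (N, 3) array of t
    """
    # C = -R^T t, evaluated for every camera at once.
    return -np.einsum("nji,nj->ni", rotations, translations)


def random_point_cloud(centers, num_points=10, scale=1.5, seed=0):
    """Sample random points and colours in a cube around the cameras.

    The cube is centred on the mean camera position and its half-extent
    is `scale` times the largest per-axis distance of any camera from
    that mean. Both defaults are placeholders, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    mid = centers.mean(axis=0)
    half_extent = max(scale * np.abs(centers - mid).max(), 1e-6)
    xyz = rng.uniform(mid - half_extent, mid + half_extent, size=(num_points, 3))
    rgb = rng.uniform(0.0, 1.0, size=(num_points, 3))
    return xyz, rgb
```

The returned `xyz` and `rgb` arrays would then stand in for the SfM point cloud when the Gaussians are created; how the initial scales are chosen is left out here.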

For the second question, we found that it definitely helps, but in some cases a naive application of our strategy can be tricky. So the answer can be yes and no at the same time. For a detailed application of our strategy in noisy-pose settings, we plan to release another paper soon, so stay tuned!

Thanks!

resurgo97 commented 7 months ago

Hi. I want to ask an additional question regarding pose estimation. If you use COLMAP anyway, then what is the advantage of starting with random point clouds, when using point clouds generated from COLMAP still outperforms the case with random initialization?

Thanks.

crepejung00 commented 7 months ago

Hi, we agree that your question arises very naturally. As mentioned in my previous answer, there are multiple pipelines where the image poses can be obtained without COLMAP. These include settings with pre-calibrated cameras and graphics pipelines such as Blender. Even given paired camera poses and images, vanilla 3DGS fails to render satisfying results without the point clouds generated by COLMAP. To generate high-quality renderings, users therefore need to go through the COLMAP process even when the camera poses are known in advance. This is a huge limitation for those users, since COLMAP normally takes from over 30 minutes up to 2 hours, depending on the number of images.

In this work, we enable users to generate satisfying results with only camera poses and images! This significantly shortens the overall pipeline in the settings mentioned above. In addition, our work is a stepping stone toward training 3DGS in pose-free few-shot or noisy-pose settings, where COLMAP struggles to converge. We found that our method works well when used for jointly optimizing the poses and the Gaussian parameters, and we are currently planning to release the code and paper for this extension around May.
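For the Blender case, the known pose usually only needs an axis-convention change before it can be used in place of a COLMAP pose. The sketch below assumes a 4x4 camera-to-world matrix in the Blender/OpenGL convention (x right, y up, camera looking down -z) and converts it to a world-to-camera rotation and translation in the COLMAP/OpenCV convention (x right, y down, camera looking down +z). The function name is hypothetical and this is not taken from the RAIN-GS repository.

```python
import numpy as np


def blender_c2w_to_colmap_w2c(c2w):
    """Convert a Blender/OpenGL camera-to-world matrix to a COLMAP-style pose.

    Returns (R, t) such that x_cam = R @ x_world + t, with the camera
    looking down +z and y pointing down.
    """
    c2w = np.array(c2w, dtype=np.float64)  # work on a copy
    # Flip the camera's y and z axes: (y up, z backward) -> (y down, z forward).
    c2w[:3, 1:3] *= -1.0
    # Invert to go from camera-to-world to world-to-camera.
    w2c = np.linalg.inv(c2w)
    return w2c[:3, :3], w2c[:3, 3]
```

As a sanity check, a Blender camera placed at (0, 0, 4) with identity rotation looks toward the origin, so after conversion the world origin should sit at depth +4 on the camera's z axis.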

Thanks.

resurgo97 commented 7 months ago

That totally makes sense. Thanks for your reply.

chenqi13814529300 commented 3 weeks ago

I am looking forward to high-quality Gaussian reconstruction from slightly less accurate poses and sparse/dense point clouds. If you know of any good work in this direction, please let me know.

cvlab-kaist / RAIN-GS

Question about camera poses? #5