graphdeco-inria / hierarchical-3d-gaussians

Official implementation of the SIGGRAPH 2024 paper "A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets"

Does not produce good results when using real poses in metric scale (same problem in vanilla 3DGS) #38

Open kevintsq opened 3 months ago

kevintsq commented 3 months ago

I used real poses to reconstruct driving scenes, with dynamic objects already masked out. I have examined these poses in COLMAP, and they should be correct. Even when the scene is not large, the results are poor: the reconstructed scene is blurry and has some floaters. The situation is even worse with vanilla 3DGS. Lowering the position lr, scale lr, percent_dense, and density grad threshold, and disabling pruning to increase the number of Gaussians, does not produce good results either. However, if I use poses from COLMAP, even the default parameters produce good results and require only a small number of Gaussians. I've seen some works on driving scenes, and the real poses they use are almost all scaled to align with the COLMAP scale. I'm wondering why. Is 3DGS sensitive to the scale of poses? How can this problem be solved? Thanks.

Hierarchical 3DGS using real pose: image

Vanilla 3DGS using real pose and different parameters (# Points from 100000+ to 6000000+):

Vanilla 3DGS using COLMAP pose and default parameters (# Points = 227803):
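
The rescaling mentioned above, where metric poses are brought to the COLMAP scale, can be sketched as a similarity alignment between corresponding camera centres (Umeyama's closed-form method). This is only a minimal illustration, not the repository's auto aligner: the function name `umeyama_alignment` and the array layout (one row per camera centre, rows in matching order) are assumptions made for the example.

```python
import numpy as np

def umeyama_alignment(src, dst):
    """Estimate scale s, rotation R, translation t so that dst ~= s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding camera centres (hypothetical layout).
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centred point sets.
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation, not a reflection.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

def apply_similarity(points, s, R, t):
    """Map (N, 3) points with the estimated similarity transform."""
    return s * np.asarray(points, dtype=np.float64) @ R.T + t
```

Here `src` would hold the metric (GPS/IMU) camera centres and `dst` the COLMAP ones. The recovered `s` shows how far apart the two scales are, and the residual after `apply_similarity` gives a rough idea of how well the two sets of poses agree in shape.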

ameuleman commented 2 months ago

Hi, did you align the poses (see here, under "Using calibrated images")?

kevintsq commented 2 months ago

Yes, I aligned the poses.

ameuleman commented 2 months ago

With the auto aligner, the poses should be scaled appropriately. I suspect that the poses you are using are noisy.

kevintsq commented 2 months ago

Thanks for your reply. The poses are obtained from GPS/IMU, so they are on a metric scale, and the starting point isn't the origin of the world coordinate system. Since I cannot fully assess the reliability of the poses, if this isn't an internal issue with the Gaussian method, it may be due to the noisy nature of the poses.
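
Because the trajectory starts far from the world origin, one thing that may be worth ruling out is precision loss when large metric coordinates are cast to single precision. The sketch below uses made-up UTM-like coordinates (an assumption, not data from this thread) and a hypothetical helper `recentre_positions` to compare the rounding error before and after subtracting the mean position in float64.

```python
import numpy as np

def recentre_positions(positions):
    """Subtract the mean camera position in float64, before any float32 cast."""
    positions = np.asarray(positions, dtype=np.float64)
    offset = positions.mean(axis=0)
    return positions - offset, offset

# Hypothetical UTM-like coordinates: large offsets, 1 cm steps between cameras.
positions = np.array([
    [500000.00, 4000000.00, 30.0],
    [500000.01, 4000000.01, 30.0],
    [500000.02, 4000000.02, 30.0],
])

# Rounding error when the raw coordinates are stored as float32.
raw_f32 = positions.astype(np.float32)
raw_err = np.abs(raw_f32.astype(np.float64) - positions).max()

# Rounding error when the coordinates are recentred first.
centred, offset = recentre_positions(positions)
centred_f32 = centred.astype(np.float32)
centred_err = np.abs(centred_f32.astype(np.float64) - centred).max()

print(f"max float32 rounding error, raw:       {raw_err:.6f} m")
print(f"max float32 rounding error, recentred: {centred_err:.6f} m")
```

If the aligned poses are already close to the origin, as the aligner is meant to ensure, this effect should not matter and the noise would have to come from the poses themselves.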

Here are the poses before alignment (zoomed out several times to make them visible in the frame): [image: poses before alignment] When we take a closer look, it seems fine: [image: a closer look]

These are the poses after alignment (appearing in the frame without needing to zoom out): [image: poses after alignment] Taking a closer look, they appear almost the same: [image: a closer look]

If I don't skip the bundle adjustment here, the rendered images are sharper, but there are some artifacts: [image: bundle adjustment render result] Taking a closer look, the poses have been changed (they are supposed to be on the same plane): [image] The following is the result using --skip_bundle_adjustment: [image]
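
The planarity observation above can be quantified rather than judged by eye. A minimal sketch, assuming the camera centres are available as an (N, 3) array: fit a least-squares plane with an SVD and compare the out-of-plane residuals before and after bundle adjustment. The function name `plane_fit_residuals` is hypothetical.

```python
import numpy as np

def plane_fit_residuals(centres):
    """Fit a least-squares plane to (N, 3) points.

    Returns the signed point-to-plane distances and the unit plane normal.
    """
    centres = np.asarray(centres, dtype=np.float64)
    centroid = centres.mean(axis=0)
    # The right singular vector with the smallest singular value is the plane normal.
    _, _, Vt = np.linalg.svd(centres - centroid, full_matrices=False)
    normal = Vt[-1]
    return (centres - centroid) @ normal, normal
```

A clearly larger maximum absolute residual after bundle adjustment than before would confirm that the optimisation pulled the cameras off their common plane.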

One of my colleagues mentioned that in one case, there was no difference between the poses before and after BA, and the rendering results were also the same. In another case, where the poses were inaccurate, the global ATE differed by about 7 cm before and after BA, and the PSNR differed by around 4. Therefore, I also suspect that the poses in my dataset may be inaccurate if this isn't an internal issue with the Gaussian method.
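
For reference, a global ATE figure like the one quoted above is commonly computed as the RMSE of per-camera position differences. The sketch below assumes both trajectories are already expressed in the same frame and listed in the same order; the function name `ate_rmse` is hypothetical, and this is not necessarily how the number in this thread was obtained.

```python
import numpy as np

def ate_rmse(estimated, reference):
    """RMSE of translational differences between two (N, 3) trajectories.

    Assumes the trajectories are already aligned and in corresponding order.
    """
    est = np.asarray(estimated, dtype=np.float64)
    ref = np.asarray(reference, dtype=np.float64)
    if est.shape != ref.shape:
        raise ValueError("trajectories must have the same shape")
    errors = np.linalg.norm(est - ref, axis=1)
    return float(np.sqrt(np.mean(errors ** 2)))
```

Calling it on the camera centres before and after bundle adjustment gives a single number for how far the optimisation moved the poses.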

GeJintian commented 1 month ago

Hello, have you managed to figure out why your results are not good? Is it because of the inaccuracy of your GPS poses, or is it related to the "scale" of the poses? I obtained poses from LiDAR odometry and generated COLMAP-like SfM points from the LiDAR points, but the result is still blurry with vanilla 3DGS. I wonder if our cases have the same cause?

kevintsq commented 1 month ago

It is probably mainly due to the inaccuracy of the GPS/IMU poses: if I add two iterations of bundle adjustment after reconstructing from the GT poses, the results become better.

GeJintian commented 1 month ago

Got it. Thank you so much for your reply!