ziyc / drivestudio

A 3DGS framework for omni urban scene reconstruction and simulation.
https://ziyc.github.io/omnire/
MIT License

camera poses #2

Closed MatteoMarengo closed 1 week ago

MatteoMarengo commented 2 weeks ago

Very nice work!

Did you use the datasets' camera poses or a COLMAP reconstruction to train the model?

Thank you :)

ziyc commented 2 weeks ago

Thank you for your interest in our work!

We used the calibrated camera poses provided by the driving datasets, which is a standard practice for driving scene reconstruction methods.

MatteoMarengo commented 2 weeks ago

Thank you for the quick answer!

Didn't you have any issues with the calibration of these camera poses on some datasets? Also, how did you get the camera poses for NuScenes? Did you have to go through the NuScenes devkit?

ziyc commented 2 weeks ago

Hi @MatteoMarengo, that's a very good question!

Regarding camera calibration for each dataset, we need to be mindful of their pose conventions, which might differ from our rasterization convention. Additionally, the camera pose calibration format varies slightly across datasets: some save the ego poses and camera-to-ego poses, while others lack ego poses and only save camera poses. These are some key points that come to mind.
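For illustration, the kind of bookkeeping involved looks roughly like this (a minimal NumPy sketch with placeholder matrices; the OpenCV-to-OpenGL flip is just one common example of a convention change, not necessarily the one used here):

```python
import numpy as np

# Case 1: the dataset provides ego-to-world and camera-to-ego extrinsics.
# The camera-to-world pose is the composition of the two 4x4 transforms.
ego_to_world = np.eye(4)   # placeholder: would come from the dataset
cam_to_ego = np.eye(4)     # placeholder: per-camera calibration
cam_to_world = ego_to_world @ cam_to_ego

# Case 2: the dataset's camera axes differ from the rasterizer's convention,
# e.g. OpenCV (x right, y down, z forward) vs. OpenGL (x right, y up, z back).
# A fixed axis-flip applied on the camera side converts between them.
OPENCV_TO_OPENGL = np.diag([1.0, -1.0, -1.0, 1.0])
cam_to_world_gl = cam_to_world @ OPENCV_TO_OPENGL
```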

For NuScenes specifically, we use the official devkit to extract pose information from their raw data.
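As a rough illustration of the devkit route (a sketch assuming the standard nuscenes-devkit and pyquaternion APIs; the actual processing lives in the preprocessing script linked below):

```python
import numpy as np
from nuscenes.nuscenes import NuScenes
from pyquaternion import Quaternion

def pose_record_to_matrix(record):
    """Turn a devkit pose record (quaternion + translation) into a 4x4 matrix."""
    mat = np.eye(4)
    mat[:3, :3] = Quaternion(record["rotation"]).rotation_matrix
    mat[:3, 3] = record["translation"]
    return mat

nusc = NuScenes(version="v1.0-mini", dataroot="/data/nuscenes", verbose=False)
sample = nusc.sample[0]
sd = nusc.get("sample_data", sample["data"]["CAM_FRONT"])

# Ego pose at the image timestamp and the static camera-to-ego calibration.
ego_to_world = pose_record_to_matrix(nusc.get("ego_pose", sd["ego_pose_token"]))
cam_to_ego = pose_record_to_matrix(nusc.get("calibrated_sensor", sd["calibrated_sensor_token"]))

# Camera-to-world pose used for reconstruction.
cam_to_world = ego_to_world @ cam_to_ego
```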

For more details on how we process poses, LiDAR data, and object annotations, and how we use the devkit for each dataset, you can also refer to our code. Here are some helpful resources:

https://github.com/ziyc/drivestudio/blob/main/docs/NuScenes.md
https://github.com/ziyc/drivestudio/blob/main/datasets/nuscenes/nuscenes_preprocess.py

I hope this response helps answer your question. Thank you again for your interest!

MatteoMarengo commented 1 week ago

Thank you for the detailed answer!

An additional question: how did you get such good results (PSNR > 25) for Waymo or nuScenes scene reconstruction using only 3DGS? I mean, there is no handling of dynamics in 3DGS, so how does it work so well even with moving vehicles?

ziyc commented 1 week ago

Hi Matteo!

3DGS is excellent at rendering high-fidelity photorealistic images at very high speed. But using 3DGS alone isn't enough for reconstructing dynamic scenes, as it's a static representation. Our method (along with many existing dynamic urban scene reconstruction methods) uses the scene graph approach, first proposed in Neural Scene Graphs.

The core idea of the scene graph is to model each dynamic object separately and move these objects with their bounding boxes. Thus, our method doesn't use a single 3DGS model, but hundreds of 3DGS models within one framework to model different instances, e.g., background, vehicles, and pedestrians.

Sounds like a heavy system, right? But in practice, Gaussians of all instances are optimized together, making the whole framework very efficient. If you're interested in how we model dynamic scenes with 3DGS, you'll find detailed answers in our paper: OmniRe: Omni Urban Scene Reconstruction
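For intuition, here is a rough sketch of the composition step at a single frame (an illustrative simplification with placeholder names, not OmniRe's actual data structures):

```python
import torch

def compose_scene(background_means, objects, frame_idx):
    """Merge background Gaussians with each object's Gaussians,
    moved into world space by that object's per-frame box pose."""
    all_means = [background_means]                     # static background, world frame
    for obj in objects:
        obj_to_world = obj["box_pose"][frame_idx]      # (4, 4) pose of the bounding box
        local = obj["means"]                           # (N, 3) Gaussians in the box frame
        world = local @ obj_to_world[:3, :3].T + obj_to_world[:3, 3]
        all_means.append(world)
    # All Gaussians are rasterized together, so gradients flow back
    # to every instance's parameters in a single optimization step.
    return torch.cat(all_means, dim=0)
```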

MatteoMarengo commented 1 week ago

Nice, thank you so much for this detailed overview!

So when you compare 3DGS with other methods in the OmniRe benchmarks, it is static 3DGS, right, straight from Inria's repo? (For example, page 8, the Waymo dataset comparison.)

ziyc commented 1 week ago

Yes, when comparing with '3DGS', we are comparing with the vanilla static 3DGS representation, which serves as a very basic baseline. When conducting the '3DGS' experiments, we didn't use Inria's official repo since it doesn't incorporate LiDAR supervision. To ensure a fair comparison, we used our re-implementation of '3DGS' with LiDAR supervision added.
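For context, "LiDAR supervision" here means an extra depth term on pixels that have a projected LiDAR return, on top of the usual photometric loss. A minimal sketch of such a term (illustrative only; the weight and exact formulation are assumptions, not the paper's):

```python
import torch

def lidar_depth_loss(rendered_depth, lidar_depth, lidar_mask, weight=0.1):
    """L1 depth loss restricted to pixels with a projected LiDAR return.

    rendered_depth: (H, W) depth rendered from the Gaussians
    lidar_depth:    (H, W) sparse depth from projecting LiDAR points into the image
    lidar_mask:     (H, W) boolean mask of pixels that received a LiDAR point
    """
    if lidar_mask.sum() == 0:
        return rendered_depth.new_zeros(())
    return weight * (rendered_depth[lidar_mask] - lidar_depth[lidar_mask]).abs().mean()
```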

16Huzeyu commented 1 week ago

@ziyc Very nice work! When we tried to directly use the camera poses from datasets like nuScenes for reconstruction, we found that the built-in poses were not as accurate as expected, leading to relatively poor reconstruction quality. We were wondering if you've also encountered this issue and how you addressed it.

MatteoMarengo commented 1 week ago

@16Huzeyu I encountered the same issue, which is why I was asking about vanilla 3DGS :), so I'm very interested in the answer too!

Did you get good reconstructions in the end? If so, which adaptations did you add?

Thanks

ziyc commented 1 week ago

Hi @16Huzeyu @MatteoMarengo, are you experimenting with your own code or with DriveStudio for the nuScenes experiments?

We encountered the same issue - the GT camera poses from nuScenes are not very accurate. (As can be seen on our project page, the results on nuScenes show more visible artifacts, which are mainly caused by inaccurate image and LiDAR poses.)

In particular, nuScenes' key frames are synchronized at 2 Hz, which we interpolate to 10 Hz by finding the closest image and LiDAR data for each interpolated timestamp. This introduces further pose errors.
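Schematically, the matching looks something like this (an illustrative NumPy sketch with made-up variable names, not the actual preprocessing code):

```python
import numpy as np

def match_to_10hz(keyframe_ts, sensor_ts):
    """Interpolate 2 Hz keyframe timestamps (seconds) to a 10 Hz timeline and
    pick, for each interpolated timestamp, the closest available sensor frame."""
    sensor_ts = np.asarray(sensor_ts)
    timeline = []
    for t0, t1 in zip(keyframe_ts[:-1], keyframe_ts[1:]):
        # 5 steps per 0.5 s keyframe interval -> 10 Hz
        timeline.extend(np.linspace(t0, t1, num=5, endpoint=False))
    timeline.append(keyframe_ts[-1])
    timeline = np.asarray(timeline)

    # Nearest-neighbor match: the small time offsets translate into pose errors,
    # which is the extra source of noise mentioned above.
    idx = np.abs(sensor_ts[None, :] - timeline[:, None]).argmin(axis=1)
    return timeline, idx
```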

To address the pose issue, which may affect other datasets as well, we implemented a camera pose optimization module (following gsplat), which should help improve pose accuracy and reconstruction quality. You can try this module and experiment with different learning rates; for less accurate poses like those in nuScenes, you might need a slightly higher learning rate than the default to optimize the camera poses effectively.

For reference, here are the relevant parts of our camera pose optimization module in our codebase:
https://github.com/ziyc/drivestudio/blob/b134d801054645da9bb944877fce12d7875af13c/configs/omnire.yaml#L253-L258
https://github.com/ziyc/drivestudio/blob/b134d801054645da9bb944877fce12d7875af13c/models/modules.py#L266-L317
https://github.com/ziyc/drivestudio/blob/b134d801054645da9bb944877fce12d7875af13c/models/trainers/base.py#L328-L329
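For intuition, the core idea behind such a module is a small learnable SE(3) correction per image, optimized jointly with the Gaussians. Below is a simplified PyTorch sketch (the parameterization and names are illustrative assumptions; see the linked code for the actual implementation):

```python
import torch
import torch.nn as nn

class CameraPoseRefiner(nn.Module):
    """Learnable per-image SE(3) deltas applied on top of the dataset poses."""

    def __init__(self, num_images):
        super().__init__()
        # 3 params for an axis-angle rotation delta, 3 for a translation delta.
        self.deltas = nn.Parameter(torch.zeros(num_images, 6))

    def forward(self, cam_to_world, image_idx):
        """cam_to_world: (4, 4) original pose; returns the refined (4, 4) pose."""
        rot_vec, trans = self.deltas[image_idx, :3], self.deltas[image_idx, 3:]
        # Rodrigues via the matrix exponential of the skew-symmetric matrix.
        skew = torch.zeros(3, 3, device=rot_vec.device)
        skew[0, 1], skew[0, 2] = -rot_vec[2], rot_vec[1]
        skew[1, 0], skew[1, 2] = rot_vec[2], -rot_vec[0]
        skew[2, 0], skew[2, 1] = -rot_vec[1], rot_vec[0]
        delta = torch.eye(4, device=rot_vec.device)
        delta[:3, :3] = torch.linalg.matrix_exp(skew)
        delta[:3, 3] = trans
        return delta @ cam_to_world

# The deltas get their own optimizer / learning rate, which is the knob
# to raise for noisier poses such as those in nuScenes.
refiner = CameraPoseRefiner(num_images=1000)
pose_optimizer = torch.optim.Adam(refiner.parameters(), lr=1e-4)
```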

MatteoMarengo commented 1 week ago

Thanks for all these details, I will give it a try. Also, what hyperparameters did you use? Did you keep the default configuration, or did you adapt the learning rates / number of iterations?

ziyc commented 1 week ago

We used the default settings in the config.

16Huzeyu commented 1 week ago

@ziyc Hello, how much of a PSNR boost can the camera pose optimization module bring on nuScenes?

ziyc commented 1 week ago

Hi @16Huzeyu, our quick test on nuScenes showed about a 1 PSNR improvement with the default learning rate of the camera pose refinement module. The current learning rate may not be optimal for handling the noisy poses in nuScenes, so I think it could bring even more improvement there. The current boost is already quite promising.