zju3dv / street_gaussians

[ECCV 2024] Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

multiple cameras and depth estimation #7

Open szhang963 opened 10 months ago

szhang963 commented 10 months ago

Hi, I'm very interested in your great work. I have two questions to ask you.

  1. Is multi-camera input more complex to handle? Can it still achieve the best reconstruction performance?
  2. How well does Street Gaussians perform on dense depth estimation?

Thank you in advance.

yifanlu0227 commented 10 months ago

Multi-camera input is much harder to handle. The performance of vanilla 3DGS drops dramatically when using 5 cameras.

szhang963 commented 10 months ago

What causes this? Is it the inconsistent intrinsic parameters across the cameras?

yunzhiy commented 10 months ago

Sorry for the late reply. I trained one Waymo sequence with multiple cameras (FRONT_LEFT, FRONT, FRONT_RIGHT), and the rendering and depth results are shown below. I hope this answers your question.

https://github.com/zju3dv/street_gaussians/assets/66961088/5f1c1019-2971-41f7-a64a-65ed6b5df59b

https://github.com/zju3dv/street_gaussians/assets/66961088/e3b6d2c6-26d0-4a4f-a278-8e7f8717f3eb

yifanlu0227 commented 10 months ago

@yunzhiy The result is great! Could you share some training parameters, such as the number of frames and the image resolution? Thanks!

yunzhiy commented 10 months ago

I use the same training parameters as in our paper.

pierremerriaux-leddartech commented 10 months ago

Hi @yunzhiy, thanks for the test. Did you optimize one gsplat for the 3 cameras, or 3 independent gsplats, one per camera? I ran some tests with the nerfstudio gsplat implementation on 5 cameras from PandaSet, and the result was worse than with only one camera. Thanks.

yunzhiy commented 10 months ago

@pierremerriaux-leddartech I use one gsplat for 3 cameras.

pierremerriaux-leddartech commented 10 months ago

Great job @yunzhiy. Would you mind sharing what the multi-camera limitation in vanilla 3DGS was? Have a good weekend.

szhang963 commented 10 months ago

@yunzhiy The multi-camera result is excellent. I'm looking forward to the release of the code.

mumianyuxin commented 10 months ago

@yifanlu0227 Could you please explain why using multiple cameras (5 or more) causes such a sharp drop in vanilla 3DGS performance?

ShaohuaL commented 9 months ago

Good job! Could you please tell me how to obtain dense depth estimates from a 3D Gaussian splatting model? @yunzhiy Thank you!

pierremerriaux-leddartech commented 9 months ago

Hi @yunzhiy,

About the video you published above: which camera pose did you use to compute the Gaussian projection and rasterization? (The pose provided by Waymo for each image is not at the middle of the camera acquisition time.) How did you deal with the fact that the Waymo cameras use a rolling shutter? The acquisition time for the full image is around 40 ms, and the camera moves during that time.

thanks
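
To make the rolling-shutter concern concrete, here is a minimal sketch of how per-scanline capture times and camera positions could be approximated. It is not taken from the Street Gaussians code and is not a description of how the authors handled the issue. The function names `scanline_times` and `interpolate_positions` are hypothetical, the readout is assumed to sweep the scanlines at a constant rate, and only the translation is interpolated (rotation would need something like spherical interpolation). The readout axis and direction are dataset-specific and are not modelled here.

```python
import numpy as np

def scanline_times(t_start, readout_duration, num_lines):
    """Capture time at the centre of each scanline.

    Assumes (hypothetically) a constant readout rate starting at t_start.
    """
    return t_start + readout_duration * (np.arange(num_lines) + 0.5) / num_lines

def interpolate_positions(times, t0, p0, t1, p1):
    """Linearly interpolate the camera position between two timestamped poses.

    Only translation is handled; rotation is deliberately left out of this sketch.
    """
    alpha = (np.asarray(times, dtype=float) - t0) / (t1 - t0)
    p0 = np.asarray(p0, dtype=float)
    p1 = np.asarray(p1, dtype=float)
    return (1.0 - alpha)[:, None] * p0 + alpha[:, None] * p1

# Example: a 40 ms readout while the vehicle moves at 10 m/s along x.
times = scanline_times(t_start=0.0, readout_duration=0.04, num_lines=4)
positions = interpolate_positions(times, 0.0, [0.0, 0.0, 0.0], 0.04, [0.4, 0.0, 0.0])
```

With these example numbers the camera travels 0.4 m between the first and last scanline, which is why a single pose per image can be a noticeable approximation at driving speeds.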

JiaxiongQ commented 7 months ago

@yunzhiy The depth map shows obvious line effects. Are these caused by taking the LiDAR point cloud as input?

yunzhiy commented 7 months ago

Hi. This line effect is caused by the sparse LiDAR depth supervision during optimization. You can obtain a visually better depth map by adding a regularization loss (for example, an inverse depth smoothness loss).
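
As a rough illustration of the two terms mentioned in this reply, the sketch below pairs a depth loss evaluated only at pixels with a LiDAR return with an edge-aware smoothness term on inverse depth. It is not the Street Gaussians implementation: the function names, the mean normalisation, and the image-gradient weighting are assumptions borrowed from common monocular-depth practice, and NumPy is used only to keep the example self-contained (training would need a differentiable framework).

```python
import numpy as np

def sparse_depth_l1(pred_depth, lidar_depth):
    """L1 depth loss over pixels that have a LiDAR return (lidar_depth > 0)."""
    mask = lidar_depth > 0
    if not mask.any():
        return 0.0
    return float(np.abs(pred_depth[mask] - lidar_depth[mask]).mean())

def inverse_depth_smoothness(pred_depth, image, eps=1e-6):
    """Edge-aware smoothness on inverse depth.

    pred_depth: (H, W) rendered depth; image: (H, W, 3) colours in [0, 1].
    Inverse-depth gradients are down-weighted where the image has strong edges.
    """
    inv = 1.0 / np.maximum(pred_depth, eps)
    inv = inv / (inv.mean() + eps)  # mean-normalise so the term ignores global scale
    d_dx = np.abs(inv[:, 1:] - inv[:, :-1])
    d_dy = np.abs(inv[1:, :] - inv[:-1, :])
    i_dx = np.abs(image[:, 1:] - image[:, :-1]).mean(axis=-1)
    i_dy = np.abs(image[1:, :] - image[:-1, :]).mean(axis=-1)
    return float((d_dx * np.exp(-i_dx)).mean() + (d_dy * np.exp(-i_dy)).mean())

# Example: combine both terms with a hypothetical weight.
rendered = np.full((4, 6), 5.0)
lidar = np.zeros((4, 6))
lidar[1, 2] = 5.5                      # a single LiDAR hit
rgb = np.zeros((4, 6, 3))
total = sparse_depth_l1(rendered, lidar) + 0.1 * inverse_depth_smoothness(rendered, rgb)
```

The smoothness term affects every pixel, so it can propagate the sparse LiDAR signal into unsupervised regions, which is one plausible reason it reduces the scanline pattern.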
