ingra14m / Deformable-3D-Gaussians

[CVPR 2024] Official implementation of "Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction"
https://ingra14m.github.io/Deformable-Gaussians/
MIT License

rendering result #5

Closed RedemptYourself closed 7 months ago

RedemptYourself commented 8 months ago

Hi, I've reproduced the paper and it's actually not that difficult. However, although I have matched the metrics reported in the paper, I suspect this is mostly due to the ability of 3D-GS to model the static parts. In my experiments, the quality of the dynamic parts does not reach the results shown in the article. Are there some details not mentioned in the article, or is something wrong with my implementation? Looking forward to your reply, thanks.

ingra14m commented 8 months ago

Hi, thank you for your reproduction of our work.

In fact, the improvement in the rendering quality of Deformable-Gaussians is unrelated to the high-quality static part of 3D-GS. You can refer to Table 1 and Figure 3 in the paper for details.

Many methods that enhance NeRF rendering quality have already been applied to modeling monocular dynamic scenes, such as TiNeuVox. 3D-GS, as a method that combines real-time rendering with high rendering quality (metrics comparable to Mip-NeRF), should theoretically produce results for monocular dynamic scenes that are similar to or slightly better than previous methods, with the added benefit of faster rendering. I believe that coupling the deformation field with 3D-GS has raised the upper limit of 3D-GS: the artifacts in the depth maps are significantly reduced compared to the results of 3D-GS in static scenes.
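
For readers skimming this thread, here is a rough sketch of what this coupling looks like at render time, based only on the paper's description; `gaussians`, `deform_net`, and `rasterize` are hypothetical placeholders, not the repository's API:

```python
import torch

def render_at_time(gaussians, deform_net, t, rasterize):
    """Warp the canonical Gaussians to time t, then rasterize (hypothetical helpers)."""
    xyz = gaussians.get_xyz                      # canonical centers, shape (N, 3)
    time = torch.full_like(xyz[..., :1], t)      # per-Gaussian timestamp, shape (N, 1)

    # The deformation field predicts per-Gaussian offsets for position,
    # rotation, and scaling; the canonical centers are detached so the
    # deformation branch does not move them directly.
    d_xyz, d_rot, d_scale = deform_net(xyz.detach(), time)

    # The offsets are applied to the canonical parameters before the
    # standard 3D-GS differentiable rasterization.
    return rasterize(
        means3D=xyz + d_xyz,
        rotations=gaussians.get_rotation + d_rot,
        scales=gaussians.get_scaling + d_scale,
        opacities=gaussians.get_opacity,
        shs=gaussians.get_features,
    )
```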

As for the performance in the dynamic part, it might be related to the learning rate of the deformation field. You can refer to https://github.com/ingra14m/Deformable-3D-Gaussians/issues/3#issuecomment-1758855276 for more details.
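
On the learning-rate point, a minimal sketch of giving the deformation network its own parameter group with an exponentially decaying learning rate, in the spirit of the 3D-GS position schedule; the names and values below are illustrative assumptions, not the repository's actual settings:

```python
import torch
import torch.nn as nn

# Placeholder for the deformation MLP (see the structure discussion below).
deform_net = nn.Sequential(nn.Linear(4, 256), nn.ReLU(), nn.Linear(256, 10))

# Illustrative values only; the real schedule is a tuning choice.
deform_lr_init, deform_lr_final, total_steps = 1.6e-4, 1.6e-6, 40_000

optimizer = torch.optim.Adam([
    {"params": deform_net.parameters(), "lr": deform_lr_init, "name": "deform"},
    # The Gaussian parameter groups (xyz, features, opacity, scaling, rotation)
    # would keep their original 3D-GS learning rates.
])

def update_deform_lr(step: int) -> float:
    """Log-linearly interpolate the deform-net learning rate over training."""
    t = min(step / total_steps, 1.0)
    lr = deform_lr_init * (deform_lr_final / deform_lr_init) ** t
    for group in optimizer.param_groups:
        if group.get("name") == "deform":
            group["lr"] = lr
    return lr
```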

RedemptYourself commented 8 months ago

Thanks for your reply, but there are still some points I'd like to confirm:
1. About the structure of the deform net: is it built from linear layers plus an activation (like ReLU), or is it just a generic MLP?
2. Have you modified the densification code of 3D-GS for the dynamic setting?
3. I noticed in comment #3 that the gradient from the deform net does not flow back to self.xyz, right?

Could you give me some clues? Looking forward to your reply, thanks.
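
To make the first and third questions concrete, here is a minimal sketch of a deformation field built purely from linear layers and ReLU activations, with the canonical positions detached before the forward pass so no gradient flows back into `self.xyz`; this illustrates those two points under my own assumptions (no positional encoding, arbitrary widths), and is not the authors' implementation:

```python
import torch
import torch.nn as nn

class DeformMLP(nn.Module):
    """Plain linear+ReLU MLP: (position, time) -> (d_xyz, d_rotation, d_scaling)."""

    def __init__(self, in_dim: int = 4, width: int = 256, depth: int = 8):
        super().__init__()
        layers, dim = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(dim, width), nn.ReLU(inplace=True)]
            dim = width
        self.backbone = nn.Sequential(*layers)
        self.head_xyz = nn.Linear(width, 3)    # position offset
        self.head_rot = nn.Linear(width, 4)    # quaternion offset
        self.head_scale = nn.Linear(width, 3)  # scaling offset

    def forward(self, xyz: torch.Tensor, t: torch.Tensor):
        h = self.backbone(torch.cat([xyz, t], dim=-1))
        return self.head_xyz(h), self.head_rot(h), self.head_scale(h)

# On question 3: detaching the canonical centers means the rendering loss can
# still update the deform net, but cannot push gradients back into xyz itself.
deform_mlp = DeformMLP()
xyz = torch.randn(1000, 3, requires_grad=True)   # stands in for the canonical centers
t = torch.zeros(1000, 1)
d_xyz, d_rot, d_scale = deform_mlp(xyz.detach(), t)
```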

RedemptYourself commented 8 months ago

I think I've experimented exactly as described in the paper, but there seem to be some issues with the rendering and I'm a bit confused. Can you give me some clues? Thanks.

ingra14m commented 8 months ago

Hi, the critical source code has already been sent to you. By the way, I do not recommend using the D-NeRF Lego dataset; the reasons are explained in the caption of Table 1. Best wishes for your research.

RedemptYourself commented 8 months ago

Thanks for sharing. I have found the difference; it is quite interesting.

RedemptYourself commented 8 months ago

Hi, I would like to know whether the experimental setup on the HyperNeRF dataset is also the same as in the paper. In my experiments, it cannot be rendered well.

ingra14m commented 8 months ago

Hi, the processing of the HyperNeRF dataset is basically consistent with the paper. Since the point clouds and poses of the HyperNeRF dataset are not accurate, some scenes, like 'broom', indeed do not converge well. What is presented in the paper are the HyperNeRF scenes where the pose is relatively accurate.

However, don't worry, we have already found a suitable real-world monocular dataset to validate the effectiveness of our method. In the next version of the paper, HyperNeRF will only serve as a reference dataset. But for reasons that everyone understands, I don't want to reveal all the insights on monocular dynamic scenes based on 3D-GS + deformation field before the submission deadline.

miya9756 commented 3 months ago

Hi, thank you for sharing this impressive work. May I ask about the performance of your model when the number of Gaussians is lower? You likely know how important the total number of Gaussians is to PSNR values. I am quite curious whether the model can still keep its rendering quality and viewpoint consistency when the Gaussians are not that dense (as shown in the appendix). Many thanks!