hustvl / 4DGaussians

[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
https://guanjunwu.github.io/4dgs/

Some help with hyperparameters please! #96

Open pablodawson opened 4 months ago

pablodawson commented 4 months ago

Hey, hope you're doing well. I've been experimenting with your codebase on my own captures, converting them to the DyNeRF structure. It reconstructs the static scene well (on par with standard 3DGS), but the dynamic part is often ignored; maybe the movement is too large? Which parameters are most relevant to mitigating this? Attached is an example frame: the dynamic part (a cyclist in this case) is almost invisible.

I'm guessing `time_smoothness_weight` should play a part in this; I'm not sure about the rest. Thanks.

[attached image: frame 00042]
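For reference, a time-smoothness regulariser of the kind `time_smoothness_weight` controls typically penalises second-order differences of the feature planes along the time axis. A minimal NumPy sketch (the function name and shapes here are illustrative, not the repo's actual implementation):

```python
import numpy as np

def time_smoothness(plane: np.ndarray, weight: float = 1e-4) -> float:
    """Penalise non-smooth motion on a (time, features) plane.

    Hypothetical sketch: mean squared second difference along the time
    axis, which is zero for features that change linearly in time.
    """
    d2 = plane[2:] - 2.0 * plane[1:-1] + plane[:-2]  # discrete Laplacian in t
    return weight * float(np.mean(d2 ** 2))
```

Note that a larger weight pushes the deformation field toward slower, smoother motion, so raising it may suppress fast-moving content like the cyclist rather than recover it.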

guanjunwu commented 4 months ago
  1. You can enlarge the training schedule, e.g. `iterations = 60000`, `densify_until_iter = 30000` or more.
  2. Actually, 4DGS struggles to learn large motions such as a bicycle; you can check the appendix of my paper.
  3. Try using a dense point cloud to initialize the 4D Gaussians.
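For point 1, those values would go in the scene's config file under `arguments/` (layout assumed from the repo's config style; exact keys may differ between configs):

```python
# Hypothetical excerpt of a scene config under arguments/, overriding defaults.
OptimizationParams = dict(
    iterations = 60000,          # longer schedule, per point 1
    densify_until_iter = 30000,  # keep densifying for half of training
)
```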

I'll mark this issue as help wanted; I hope others can join the discussion on handling large motions such as the cyclist :)

azzarelli commented 2 months ago

I've been stuck on (and still am) avoiding overfitting scenes with sparse views. Through this, there are a couple of things I think I understand better, namely that the regularisers, apart from the plane TV, won't have a significant impact on the final result (unless the weights are really small/large). Essentially, if you are not getting "okay" results without the temporal regularisers, then using or tuning them won't make a big difference.

As @guanjunwu alludes to, plane-based methods are much more sensitive to initialisation. So you may want to consider other methods (such as DUSt3R) to improve point-cloud initialisation, or perhaps increase the bounding region and the number of initial points. Additionally, for cases such as mine, a lower learning rate and longer training time can help avoid overfitting (even if training then takes very long).
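A dense initialisation of the kind mentioned above can be sketched as follows (purely illustrative; DUSt3R or another geometry-aware method would give structured points rather than random ones):

```python
import numpy as np

def dense_random_init(n_points: int = 100_000, radius: float = 2.0, seed: int = 0):
    """Hypothetical dense initial point cloud inside an enlarged bounding box,
    as an alternative to a sparse SfM/COLMAP cloud."""
    rng = np.random.default_rng(seed)
    xyz = rng.uniform(-radius, radius, size=(n_points, 3))  # positions
    rgb = rng.uniform(0.0, 1.0, size=(n_points, 3))         # initial colours
    return xyz, rgb
```

More initial points and a larger `radius` give the deformation field more primitives that can end up covering the dynamic region.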

hsilvaga commented 1 month ago

> As @guanjunwu alludes to, plane based methods are a lot more sensitive to the initialisation. So you may want to consider other methods (such as dust3R) to improve point-cloud initialisation, or perhaps increasing the bounding region and number of initial PCs. Additionally, for cases such as mine, lower learning rate and longer training time can help avoid overfitting (even if it takes very long to train).

I'm encountering a similar issue with dense point clouds, namely PCLs coming from FlowMap. Which learning rates did you lower? Was it all of them in general, or just the ones related to 4DGS (i.e. `deformation_lr` / `grid_lr`)?
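For context, the learning-rate knobs being discussed usually sit side by side in the optimisation config; a hypothetical excerpt marking which ones are 4DGS-specific (the values are illustrative, not recommendations, and the exact key names should be checked against your config):

```python
# Hypothetical optimisation settings; names follow the 3DGS/4DGS convention
# of *_lr_init / *_lr_final pairs.
lr_config = dict(
    position_lr_init    = 1.6e-4,  # shared with static 3DGS
    deformation_lr_init = 1.6e-4,  # 4DGS-specific: deformation MLP
    grid_lr_init        = 1.6e-3,  # 4DGS-specific: plane/grid features
)
```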