SJoJoK / 3DGStream

[CVPR 2024 Highlight] Official repository for the paper "3DGStream: On-the-fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos".
https://sjojok.github.io/3dgstream
MIT License

Questions about the details of the initial training #15

Closed bbangsik13 closed 3 months ago

bbangsik13 commented 3 months ago

Thanks for your great work!

However, I have some questions about the initial training.

I cloned the code of gaussian-splatting, modified it so that test_cam becomes the 0th frame, and then ran it with the dataset you provided as follows:

python train.py -s ../test/flame_steak_suite/frame000000/ -m outputs/flame_steak/frame000000 -r 2 --eval --sh_degree 1 --test_iteration 7_000 10_000 15_000 20_000 24_000 27_000 30_000

The result doesn't perform as well as the ply you provided.

The changes in test PSNR during training are as follows.

[image: test PSNR during training]

On the other hand, the test PSNR of the ply you provided is 34.68.

Can you help me solve this problem?

SJoJoK commented 3 months ago

Hello,

In fact, the optimization of Gaussian-Splatting has some randomness, even if you fix the seed (please refer to https://github.com/graphdeco-inria/gaussian-splatting/issues/89).

The 3DGS file (.ply) we provided in this repo is a reasonably good result obtained after running the optimization multiple times.

To improve the quality of the initial_3dgs, you can use any tricks, such as deriving a suitable 3DGS file (.ply) from works on static reconstruction (e.g., Scaffold-GS) or using a dense point cloud as suggested in the 4DG works.

Thanks.

bbangsik13 commented 3 months ago

Thank you for your response!

I will try multiple runs as you suggested. In fact, I have also experimented with a dense point cloud and ran the experiment about 10 times, but I still couldn't achieve performance close to the file you provided. 😢

To explain my situation in more detail, the test PSNR peaks at around 6000 iterations (about 32.5 dB) and then gradually decreases. Even when using a dense point cloud, the PSNR goes up to about 33.8 dB at 4000 iterations but then gradually decreases, with some floaters appearing (maybe it is a kind of overfitting). Therefore, if this issue persists, I'm considering stopping the training at the peak.
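Stopping at the peak could be sketched roughly as below. This is only an illustration in plain Python: the function names `best_checkpoint` and `should_stop` are hypothetical, and the PSNR values in the example are made-up placeholders shaped like the curve described above, not measured results.

```python
def best_checkpoint(psnr_by_iter):
    """Return (iteration, psnr) for the evaluation with the highest test PSNR."""
    if not psnr_by_iter:
        raise ValueError("no evaluations recorded")
    return max(psnr_by_iter.items(), key=lambda kv: kv[1])


def should_stop(history, patience=3, min_delta=0.0):
    """Return True once the last `patience` evaluations all failed to beat
    the best PSNR seen before them by more than `min_delta`."""
    if len(history) <= patience:
        return False
    best_before = max(history[:-patience])
    return all(p <= best_before + min_delta for p in history[-patience:])


# Placeholder numbers for illustration only (not real measurements).
example = {2000: 30.9, 4000: 32.1, 6000: 32.5, 8000: 32.3, 10000: 32.0}

iteration, psnr = best_checkpoint(example)
print(f"keep the point cloud saved at iteration {iteration} ({psnr:.1f} dB)")
print("stop now?", should_stop(list(example.values()), patience=2))
```

Keeping the peak only works if a point cloud was actually saved at that iteration, so the evaluation and save schedules need to line up.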

Additionally, it seems you performed hyperparameter tuning in the link you shared. Did you use the same settings when training the provided ply? (It's not mentioned in the README 🥲) If possible, could you share the initial 3DGS training details (whether you used SfM or MVS, and the hyperparameters)?

Thanks.

SJoJoK commented 3 months ago

Yes, I can provide the cfg_args.json that I used for the reconstruction of the init_3dgs of flame_steak, probably in the next few days when I'm available. There's nothing special about this optimization, just some fine-tuning of hyper-parameters. However, you also need to early-stop and run the optimization multiple times, and even then a result with PSNR > 34 is not guaranteed. I'll add a comment in this issue to notify you when I do so.

Another thing that I want to mention is that 3DGStream is designed for dynamic reconstruction, which means you can try it on the init_3dgs that you've got now (in fact, an init_3dgs with PSNR = 33.8 is good enough for 3DGStream to reconstruct the dynamic scene for the next 299 frames). If you are conducting experiments for your paper, you can use the same init_3dgs to compare your method and ours. Certainly, it would be nice if you got an init_3dgs with even better quality (no matter what tricks or methods are used).
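Running the optimization several times with frequent evaluation could be scripted along these lines. This is a sketch under assumptions, not the exact procedure used for the provided ply: the helper `build_run_commands` and the `runNN` output folders are hypothetical, the paths and the `-r 2 --eval --sh_degree 1` flags are copied from the command earlier in this thread, and the dense `--test_iterations` / `--save_iterations` schedule is an illustrative choice that costs extra disk space.

```python
import shlex


def build_run_commands(source, out_root, n_runs, eval_every=1000, iterations=30000):
    """Build one train.py command per run, each writing to its own model folder.

    Evaluating and saving every `eval_every` iterations makes it possible to
    go back to the point cloud saved at the best test PSNR afterwards.
    """
    steps = [str(i) for i in range(eval_every, iterations + 1, eval_every)]
    commands = []
    for run in range(n_runs):
        commands.append([
            "python", "train.py",
            "-s", source,
            "-m", f"{out_root}/run{run:02d}",  # hypothetical per-run folder
            "-r", "2", "--eval", "--sh_degree", "1",
            "--test_iterations", *steps,
            "--save_iterations", *steps,
        ])
    return commands


commands = build_run_commands(
    "../test/flame_steak_suite/frame000000/", "outputs/flame_steak", n_runs=3
)
for cmd in commands:
    # Print only; pass `cmd` to subprocess.run(cmd, check=True) to launch it.
    print(shlex.join(cmd))
```

After the runs finish, the test PSNR logs of each run can be compared and the single best saved point cloud kept as the init_3dgs.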

SJoJoK commented 3 months ago

Hi, this is the cfg_args.json that we used to train the init_3dgs of flame_steak:

{
    "extent": 0,
    "sh_degree": 1,
    "source_path": "dataset/DyNeRF/frames/flame_steak/",
    "model_path": "output/steak_15k/sh_1",
    "output_path": "output/steak_15k/sh_1",
    "video_path": "",
    "ply_name": "points3D.ply",
    "images": "images_2",
    "resolution": 1,
    "white_background": false,
    "data_device": "cuda",
    "eval": true,
    "iterations": 30000,
    "iterations_s2": 0,
    "first_load_iteration": 15000,
    "position_lr_init": 0.00016,
    "position_lr_final": 1.6e-06,
    "position_lr_delay_mult": 0.01,
    "position_lr_max_steps": 30000,
    "feature_lr": 0.0025,
    "opacity_lr": 0.05,
    "scaling_lr": 0.005,
    "rotation_lr": 0.001,
    "percent_dense": 0.01,
    "lambda_dssim": 0.2,
    "depth_smooth": 0.0,
    "res_cache_lr": null,
    "lambda_dxyz": 0.0,
    "lambda_drot": 0.0,
    "densification_interval": 100,
    "opacity_reset_interval": 3000,
    "densify_from_iter": 500,
    "densify_until_iter": 15000,
    "densify_grad_threshold": 0.0002,
    "res_conf_path": "",
    "res_param_path": "",
    "batch_size": 1,
    "s2_type": "spawn",
    "s2_adding": false,
    "num_of_split": 1,
    "std_scale": 1,
    "min_opacity": 0.005,
    "rotate_sh": true,
    "convert_SHs_python": false,
    "compute_cov3D_python": false,
    "debug": false,
    "bwd_depth": false,
    "ip": "127.0.0.1",
    "port": 6009,
    "debug_from": -1,
    "load_iteration": null,
    "detect_anomaly": false,
    "test_iterations": [
        7000,
        10000,
        15000,
        20000,
        24000,
        27000,
        30000
    ],
    "save_iterations": [
        7000,
        10000,
        15000,
        20000,
        24000,
        27000,
        30000
    ],
    "quiet": false,
    "checkpoint_iterations": [],
    "start_checkpoint": null
}
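
To see where a local run differs from the configuration above, the two sets of settings can be compared key by key. The following is a minimal sketch: `load_config` and `diff_configs` are hypothetical helpers, and the `mine` dictionary only lists the options visible in the command posted earlier in this thread (`-r 2 --eval --sh_degree 1`); anything else would need to be filled in from the local run.

```python
import json

MISSING = "<missing>"


def load_config(path):
    """Load a cfg_args.json file like the one above into a dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def diff_configs(mine, reference):
    """Return {key: (my_value, reference_value)} for every key that differs."""
    diffs = {}
    for key in sorted(set(mine) | set(reference)):
        a = mine.get(key, MISSING)
        b = reference.get(key, MISSING)
        if a != b:
            diffs[key] = (a, b)
    return diffs


# A few keys copied from the JSON above (normally: load_config("cfg_args.json")).
reference = {"sh_degree": 1, "images": "images_2", "resolution": 1, "eval": True}
# Options taken from the command line posted earlier in this thread.
mine = {"sh_degree": 1, "resolution": 2, "eval": True}

for key, (a, b) in diff_configs(mine, reference).items():
    print(f"{key}: mine={a!r} reference={b!r}")
```

A reported difference is only a starting point for checking; for example, whether reading `images_2` at resolution 1 matches downscaling with `-r 2` depends on how the dataset was prepared.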

bbangsik13 commented 3 months ago

Thank you for sharing the cfg file. I'll try it again!

The dynamic reconstruction with the provided init_3dgs works without any issues! It's just hard to get a good init 😂