hustvl / 4DGaussians

[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
https://guanjunwu.github.io/4dgs/
Apache License 2.0

Failure on custom data either dynamic or static #38

Closed: YiLin32Wang closed this issue 11 months ago

YiLin32Wang commented 11 months ago

Hi there,

Thanks for the great work!!

I've been trying to use my own custom data generated from Blender, using an animation from Mixamo and sampled camera viewpoints. The input image frames look like this (example frame: frame_0001). I applied the D-NeRF dataset arguments directly and tried my best to match the D-NeRF setup: 800x800 resolution for each frame, with sparse camera views sampled for the train, test, and val sets.
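For reference, the frames and poses go into a D-NeRF-style transforms_train.json (the NeRF-synthetic layout plus a per-frame time value); a minimal sketch of the format I'm producing, with placeholder values:

import json
import numpy as np

# D-NeRF-style transforms file: NeRF-synthetic layout plus a per-frame "time" in [0, 1].
# The matrix and the camera_angle_x below are placeholders, not my real values.
def write_transforms(path, camera_angle_x, frames):
    data = {
        "camera_angle_x": camera_angle_x,          # horizontal FoV in radians
        "frames": [
            {
                "file_path": f["file_path"],        # e.g. "./train/frame_0001" (no extension)
                "time": f["time"],                  # normalized timestamp in [0, 1]
                "transform_matrix": np.asarray(f["c2w"]).tolist(),  # 4x4 camera-to-world
            }
            for f in frames
        ],
    }
    with open(path, "w") as fp:
        json.dump(data, fp, indent=4)

# usage sketch with a single dummy frame
write_transforms(
    "transforms_train.json",
    camera_angle_x=0.6911,
    frames=[{"file_path": "./train/frame_0001", "time": 0.0, "c2w": np.eye(4)}],
)

The config overrides I used, adapted from the D-NeRF config, are: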

ModelHiddenParams = dict(
    kplanes_config = {
        'grid_dimensions': 2,
        'input_coordinate_dim': 4,
        'output_coordinate_dim': 32,
        'resolution': [64, 64, 64, 100]  # [64, 64, 64, 50]
    },
    multires = [1, 2, 4, 8],
    defor_depth = 0,
    net_width = 64,
    plane_tv_weight = 0,
    time_smoothness_weight = 0,
    l1_time_planes = 0,
    weight_decay_iteration = 0,
    bounds = 1.6
)
OptimizationParams = dict(
    coarse_iterations = 3000,
    deformation_lr_init = 0.00016,
    deformation_lr_final = 0.0000016,
    deformation_lr_delay_mult = 0.01,
    grid_lr_init = 0.0016,
    grid_lr_final = 0.000016,
    iterations = 20000,
    pruning_interval = 8000,
    percent_dense = 0.01,
    # opacity_reset_interval = 30000
)
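As far as I understand, these plain-Python dicts simply override the training defaults when the config file is passed in. An illustrative sketch of that override pattern (not necessarily the repo's actual loader; the group names are taken from the config above):

import argparse
import runpy

# Illustrative only: execute a plain-Python config file (like the one above) and let
# its dict entries override argparse defaults. The repo's real loader may differ.
def merge_config(args: argparse.Namespace, config_path: str) -> argparse.Namespace:
    cfg = runpy.run_path(config_path)            # runs the .py file, returns its globals
    for group in ("ModelHiddenParams", "OptimizationParams"):
        for key, value in cfg.get(group, {}).items():
            setattr(args, key, value)            # the config value wins over the default
    return args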

But during training, the PSNR and L1 loss stay the same while densification goes on, in both the coarse and the fine stage.

(training log screenshots)
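To be precise about what "stays the same" means: PSNR is just a log-scaled MSE between a rendered frame and its ground truth, so a flat PSNR together with a flat L1 means the renders are not getting any closer to the target images. A quick way to check, assuming images normalized to [0, 1]:

import numpy as np

# Compare a rendered frame against its ground truth (both HxWx3 arrays in [0, 1]).
def l1_and_psnr(render, gt):
    l1 = np.abs(render - gt).mean()
    mse = ((render - gt) ** 2).mean()
    psnr = 10.0 * np.log10(1.0 / mse)    # peak signal value is 1.0 for normalized images
    return float(l1), float(psnr)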

And the rendered video does not look right:

https://github.com/hustvl/4DGaussians/assets/91527702/2ec17a1c-fee7-40b5-9d0a-04478d8a69b3

I also tried fitting only the first frame, and it still has the same issue. The rendered video:

https://github.com/hustvl/4DGaussians/assets/91527702/05dd4090-384e-45e0-b3a9-d54dcaacc7b5

Do you know what might cause the issue?

guanjunwu commented 11 months ago

See this issue: maybe the camera poses are wrong?

YiLin32Wang commented 11 months ago

> See this issue: maybe the camera poses are wrong?

Thanks for the quick reply! I've changed the transform_matrix computation as suggested in the issue you mentioned, but the results are not much improved; they just look denser this time. And I've double-checked that the revised version still works on the D-NeRF dataset.
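To rule out the poses themselves, this is the kind of sanity check that can be run on the exported matrices, treating each transform_matrix as camera-to-world the way the Blender-style loader expects (the file name is just from my export):

import json
import numpy as np

# Sanity-check exported poses: interpret transform_matrix as camera-to-world, make sure
# the rotation part is a proper rotation, and print the camera centers, which should
# sit on the sampled viewpoints around the subject.
with open("transforms_train.json") as fp:
    frames = json.load(fp)["frames"]

for f in frames:
    c2w = np.asarray(f["transform_matrix"], dtype=np.float64)
    R, t = c2w[:3, :3], c2w[:3, 3]
    assert np.allclose(R @ R.T, np.eye(3), atol=1e-4), "rotation part is not orthonormal"
    assert np.isclose(np.linalg.det(R), 1.0, atol=1e-4), "rotation part contains a reflection"
    print(f["file_path"], "camera center:", np.round(t, 3))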

https://github.com/hustvl/4DGaussians/assets/91527702/216f899b-a7a0-4be3-8b92-19618bc30467

YiLin32Wang commented 11 months ago

I've already resolved this problem. It was the world-to-camera matrix that I exported from Blender, which dataset_readers.py interprets as a camera-to-world matrix. After I inverted it, the model converges reasonably.
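For anyone hitting the same thing: the loader reads transform_matrix as camera-to-world (the Blender/NeRF-synthetic convention), so a world-to-camera (view) matrix has to be inverted before it goes into the JSON. In numpy terms, with a dummy pose for illustration:

import numpy as np

# dataset_readers.py interprets "transform_matrix" as camera-to-world, so a
# world-to-camera (view) matrix exported from Blender must be inverted first.
def to_camera_to_world(w2c):
    return np.linalg.inv(w2c)

# example: identity rotation with the camera sitting at (0, 0, 4) in world space
w2c = np.eye(4)
w2c[:3, 3] = [0.0, 0.0, -4.0]          # world -> camera translation
c2w = to_camera_to_world(w2c)
print(c2w[:3, 3])                      # [0. 0. 4.]: the camera center in world coordinates

Equivalently, in Blender the camera object's matrix_world is already a camera-to-world transform, so exporting that directly (rather than its inverse or the view matrix) avoids the problem.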