Guanjun Wu 1, Taoran Yi 2, Jiemin Fang 3‡, Lingxi Xie 3 , Xiaopeng Zhang 3 , Wei Wei 1 ,Wenyu Liu 2 , Qi Tian 3 , Xinggang Wang 2‡✉
1 School of CS, HUST 2 School of EIC, HUST 3 Huawei Inc.
* Equal Contributions. ‡ Project Lead. ✉ Corresponding Author.
Our method converges very quickly and achieves real-time rendering speed.
New Colab demo: (Thanks Tasmay-Tibrewal.)
Old Colab demo: (Thanks camenduru.)
Light Gaussian implementation: This link (Thanks pablodawson)
2024.6.25: We cleaned the code and added an explanation of the parameters.
2024.3.25: Updated guidance for the hypernerf and dynerf datasets.
2024.03.04: We changed the hyperparameters of the Neu3D dataset to match those in our paper.
2024.02.28: Updated SIBR viewer guidance.
2024.02.27: Accepted by CVPR 2024. We removed some logging settings used for debugging; the corrected training time is only 8 mins (20 mins before) on the D-NeRF datasets and 30 mins (1 hour before) on the HyperNeRF datasets. The rendering quality is not affected.
Please follow the 3D-GS instructions to install the required packages.
git clone https://github.com/hustvl/4DGaussians
cd 4DGaussians
git submodule update --init --recursive
conda create -n Gaussians4D python=3.7
conda activate Gaussians4D
pip install -r requirements.txt
pip install -e submodules/depth-diff-gaussian-rasterization
pip install -e submodules/simple-knn
In our environment, we use pytorch=1.13.1+cu116.
For synthetic scenes: The dataset provided in D-NeRF is used. You can download the dataset from dropbox.
For real dynamic scenes: The dataset provided in HyperNeRF is used. You can download scenes from Hypernerf Dataset and organize them as Nerfies.
Meanwhile, the Plenoptic Dataset can be downloaded from its official website. To save memory, extract the frames of each video and then organize your dataset as follows.
├── data
│   | dnerf
│     ├── mutant
│     ├── standup
│     ├── ...
│   | hypernerf
│     ├── interp
│     ├── misc
│     ├── virg
│   | dynerf
│     ├── cook_spinach
│       ├── cam00
│           ├── images
│               ├── 0000.png
│               ├── 0001.png
│               ├── 0002.png
│               ├── ...
│       ├── cam01
│           ├── images
│               ├── 0000.png
│               ├── 0001.png
│               ├── ...
│     ├── cut_roasted_beef
│     ├── ...
For multi-view scenes: If you want to train on your own multi-view dataset, organize it as follows:
├── data
│   | multipleview
│     | (your dataset name)
│       | cam01
│           ├── frame_00001.jpg
│           ├── frame_00002.jpg
│           ├── ...
│       | cam02
│           ├── frame_00001.jpg
│           ├── frame_00002.jpg
│           ├── ...
│       | ...
After that, you can use the multipleviewprogress.sh script we provide to generate the pose and point cloud data. Use it as follows:
bash multipleviewprogress.sh (your dataset name)
You need to ensure that the data folder is organized as follows after running multipleviewprogress.sh:
├── data
│   | multipleview
│     | (your dataset name)
│       | cam01
│           ├── frame_00001.jpg
│           ├── frame_00002.jpg
│           ├── ...
│       | cam02
│           ├── frame_00001.jpg
│           ├── frame_00002.jpg
│           ├── ...
│       | ...
│       | sparse_
│           ├── cameras.bin
│           ├── images.bin
│           ├── ...
│       | points3D_multipleview.ply
│       | poses_bounds_multipleview.npy
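The expected layout above can be checked before training with a small script. The following is a minimal sketch, not part of the repository: the function name `missing_multipleview_entries` is hypothetical, and it only checks the entries that are explicitly listed in the tree (it assumes camera folders are named `cam*` and frames `frame_*.jpg`, as shown).

```python
from pathlib import Path
from typing import List


def missing_multipleview_entries(dataset_dir: str) -> List[str]:
    """Return the expected entries that are absent from a multipleview dataset folder.

    Hypothetical helper: it mirrors the directory tree shown in this README
    and does not inspect the contents of any file.
    """
    root = Path(dataset_dir)
    missing: List[str] = []

    # Files that multipleviewprogress.sh is expected to write at the dataset root.
    for name in ("points3D_multipleview.ply", "poses_bounds_multipleview.npy"):
        if not (root / name).is_file():
            missing.append(name)

    # COLMAP outputs listed under the sparse_ folder.
    for name in ("cameras.bin", "images.bin"):
        if not (root / "sparse_" / name).is_file():
            missing.append(f"sparse_/{name}")

    # Every camera folder should contain at least one frame.
    cam_dirs = sorted(p for p in root.glob("cam*") if p.is_dir())
    if not cam_dirs:
        missing.append("cam*/ (no camera folders found)")
    for cam in cam_dirs:
        if not any(cam.glob("frame_*.jpg")):
            missing.append(f"{cam.name}/frame_*.jpg")

    return missing
```

An empty list means every listed entry was found; for example, `missing_multipleview_entries("data/multipleview/(your dataset name)")` can be called right after running multipleviewprogress.sh.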
For training synthetic scenes such as bouncingballs, run
python train.py -s data/dnerf/bouncingballs --port 6017 --expname "dnerf/bouncingballs" --configs arguments/dnerf/bouncingballs.py
For training dynerf scenes such as cut_roasted_beef, run
# First, extract the frames of each video.
python scripts/preprocess_dynerf.py --datadir data/dynerf/cut_roasted_beef
# Second, generate point clouds from input data.
bash colmap.sh data/dynerf/cut_roasted_beef llff
# Third, downsample the point clouds generated in the second step.
python scripts/downsample_point.py data/dynerf/cut_roasted_beef/colmap/dense/workspace/fused.ply data/dynerf/cut_roasted_beef/points3D_downsample2.ply
# Finally, train.
python train.py -s data/dynerf/cut_roasted_beef --port 6017 --expname "dynerf/cut_roasted_beef" --configs arguments/dynerf/cut_roasted_beef.py
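The four steps above follow the same pattern for every dynerf scene, so they can be generated from the scene name. This is a minimal sketch rather than a script shipped with the repository: `dynerf_commands` is a hypothetical helper, and it assumes that a config file `arguments/dynerf/<scene>.py` exists for the scene and that the data lives under `data/dynerf`.

```python
import subprocess
from typing import List


def dynerf_commands(scene: str, port: int = 6017,
                    data_root: str = "data/dynerf") -> List[List[str]]:
    """Build the preprocessing and training commands for one dynerf scene.

    Hypothetical helper: the paths mirror the cut_roasted_beef example in
    this README; nothing is executed here.
    """
    scene_dir = f"{data_root}/{scene}"
    return [
        # 1. Extract the frames of each video.
        ["python", "scripts/preprocess_dynerf.py", "--datadir", scene_dir],
        # 2. Generate point clouds from the input data.
        ["bash", "colmap.sh", scene_dir, "llff"],
        # 3. Downsample the point cloud produced in step 2.
        ["python", "scripts/downsample_point.py",
         f"{scene_dir}/colmap/dense/workspace/fused.ply",
         f"{scene_dir}/points3D_downsample2.ply"],
        # 4. Train.
        ["python", "train.py", "-s", scene_dir, "--port", str(port),
         "--expname", f"dynerf/{scene}",
         "--configs", f"arguments/dynerf/{scene}.py"],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run the commands in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_all(dynerf_commands("cut_roasted_beef"))` from the repository root would run the same sequence as the commands listed above.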
For training hypernerf scenes such as virg/broom: Point clouds pregenerated by COLMAP are provided here. Just download them and put them in the corresponding folders, and you can skip the first two steps below. Alternatively, you can run all the commands directly.
# First, compute dense point clouds with COLMAP.
bash colmap.sh data/hypernerf/virg/broom2 hypernerf
# Second, downsample the point clouds generated in the first step.
python scripts/downsample_point.py data/hypernerf/virg/broom2/colmap/dense/workspace/fused.ply data/hypernerf/virg/broom2/points3D_downsample2.ply
# Finally, train.
python train.py -s data/hypernerf/virg/broom2/ --port 6017 --expname "hypernerf/broom2" --configs arguments/hypernerf/broom2.py
For training multi-view scenes, you need to create a configuration file named (your dataset name).py under "./arguments/multipleview". After that, run
python train.py -s data/multipleview/(your dataset name) --port 6017 --expname "multipleview/(your dataset name)" --configs arguments/multipleview/(your dataset name).py
For your custom datasets, install nerfstudio and follow its COLMAP pipeline. Install COLMAP first, then:
pip install nerfstudio
# Compute camera poses with the COLMAP pipeline.
ns-process-data images --data data/your-data --output-dir data/your-ns-data
cp -r data/your-ns-data/images data/your-ns-data/colmap/images
python train.py -s data/your-ns-data/colmap --port 6017 --expname "custom" --configs arguments/hypernerf/default.py
You can customize your training config through the config files.
You can also save checkpoints during training:
python train.py -s data/dnerf/bouncingballs --port 6017 --expname "dnerf/bouncingballs" --configs arguments/dnerf/bouncingballs.py --checkpoint_iterations 200 # change it.
Then resume training from the checkpoint with:
python train.py -s data/dnerf/bouncingballs --port 6017 --expname "dnerf/bouncingballs" --configs arguments/dnerf/bouncingballs.py --start_checkpoint "output/dnerf/bouncingballs/chkpnt_coarse_200.pth"
# finestage: --start_checkpoint "output/dnerf/bouncingballs/chkpnt_fine_200.pth"
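The two checkpoint paths above suggest a naming pattern of `output/<expname>/chkpnt_<stage>_<iteration>.pth`. The sketch below builds such a path; `checkpoint_path` is a hypothetical helper, and the pattern is inferred only from the two examples in this README, so please verify it against the files actually written to your output folder.

```python
from pathlib import Path


def checkpoint_path(expname: str, stage: str, iteration: int,
                    output_root: str = "output") -> str:
    """Build a checkpoint path following the pattern seen in the examples above.

    Hypothetical helper: the pattern chkpnt_<stage>_<iteration>.pth is an
    assumption inferred from this README, not read from the training code.
    """
    if stage not in ("coarse", "fine"):
        raise ValueError(f"stage must be 'coarse' or 'fine', got {stage!r}")
    return (Path(output_root) / expname / f"chkpnt_{stage}_{iteration}.pth").as_posix()


print(checkpoint_path("dnerf/bouncingballs", "coarse", 200))
# output/dnerf/bouncingballs/chkpnt_coarse_200.pth
```

The returned string can be passed to `--start_checkpoint`.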
Run the following script to render the images.
python render.py --model_path "output/dnerf/bouncingballs/" --skip_train --configs arguments/dnerf/bouncingballs.py
You can just run the following script to evaluate the model.
python metrics.py --model_path "output/dnerf/bouncingballs/"
There are some helpful scripts; please feel free to use them.
export_perframe_3DGS.py: exports the 3D Gaussian point clouds at each timestamp.
usage:
python export_perframe_3DGS.py --iteration 14000 --configs arguments/dnerf/lego.py --model_path output/dnerf/lego
A set of 3D Gaussians will be saved in output/dnerf/lego/gaussian_pertimestamp.
weight_visualization.ipynb: visualizes the weights of the Multi-resolution HexPlane module.
merge_many_4dgs.py: merges your trained 4DGS models.
usage:
export exp_name="dynerf"
python merge_many_4dgs.py --model_path output/$exp_name/sear_steak
colmap.sh: generates point clouds from input data.
bash colmap.sh data/hypernerf/virg/vrig-chicken hypernerf
bash colmap.sh data/dynerf/sear_steak llff
The Blender format does not seem to work. Pull requests that fix it are welcome.
downsample_point.py: downsamples the point clouds generated by SfM.
python scripts/downsample_point.py data/dynerf/sear_steak/colmap/dense/workspace/fused.ply data/dynerf/sear_steak/points3D_downsample2.ply
In my paper, I always use colmap.sh to generate dense point clouds and downsample them to fewer than 40000 points.
Here is some code that may be useful but was never adopted in my paper; you can also try it.
You are also welcome to check out these awesome concurrent/related works, including but not limited to:
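To illustrate what downsampling to a point budget can look like, here is a minimal pure-Python sketch of voxel-grid downsampling that enlarges the voxel size until the cloud has fewer than 40000 points. It is not the implementation in scripts/downsample_point.py: the function names, the starting voxel size, and the growth factor are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def voxel_downsample(points: Sequence[Point], voxel_size: float) -> List[Point]:
    """Replace all points that fall into the same cubic voxel by their mean."""
    # Each bucket accumulates [sum_x, sum_y, sum_z, count] for one voxel.
    buckets: Dict[Tuple[int, int, int], List[float]] = defaultdict(
        lambda: [0.0, 0.0, 0.0, 0])
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        acc = buckets[key]
        acc[0] += x
        acc[1] += y
        acc[2] += z
        acc[3] += 1
    return [(sx / n, sy / n, sz / n) for sx, sy, sz, n in buckets.values()]


def downsample_below(points: Sequence[Point], max_points: int = 40000,
                     voxel_size: float = 0.01, growth: float = 1.5) -> List[Point]:
    """Grow the voxel size until fewer than max_points points remain.

    The default voxel_size and growth are illustrative assumptions; suitable
    values depend on the scale of the scene.
    """
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    result = list(points)
    while len(result) >= max_points:
        # Always downsample the original cloud, so averaging is not compounded.
        result = voxel_downsample(points, voxel_size)
        voxel_size *= growth
    return result
```

A cloud that is already under the budget is returned unchanged, so the helper is safe to call on any input.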
Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction
SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes
MD-Splatting: Learning Metric Deformation from 4D Gaussians in Highly Deformable Scenes
4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency
Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models
DreamGaussian4D: Generative 4D Gaussian Splatting
EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene Reconstruction
EndoGS: Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting
Endo-4DGS: Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting
This project is still under development. Please feel free to raise issues or submit pull requests to contribute to our codebase.
Some of our source code is borrowed from 3DGS, K-planes, HexPlane, TiNeuVox, and Depth-Rasterization. We sincerely appreciate the excellent work of these authors.
We would like to express our sincere gratitude to @zhouzhenghong-gt for his revisions to our code and discussions on the content of our paper.
Some insights about neural voxel grids and dynamic scene reconstruction originate from TiNeuVox. If you find this repository/work helpful in your research, please consider citing these papers and giving the repository a ⭐.
@InProceedings{Wu_2024_CVPR,
author = {Wu, Guanjun and Yi, Taoran and Fang, Jiemin and Xie, Lingxi and Zhang, Xiaopeng and Wei, Wei and Liu, Wenyu and Tian, Qi and Wang, Xinggang},
title = {4D Gaussian Splatting for Real-Time Dynamic Scene Rendering},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2024},
pages = {20310-20320}
}
@inproceedings{TiNeuVox,
author = {Fang, Jiemin and Yi, Taoran and Wang, Xinggang and Xie, Lingxi and Zhang, Xiaopeng and Liu, Wenyu and Nie\ss{}ner, Matthias and Tian, Qi},
title = {Fast Dynamic Radiance Fields with Time-Aware Neural Voxels},
year = {2022},
booktitle = {SIGGRAPH Asia 2022 Conference Papers}
}