We initialize the third component of each Gaussian's scale to log(0), i.e. negative infinity. Although log(0) is mathematically undefined, the subsequent exponential activation (exp) maps negative infinity to 0, so after initialization and activation the third scale component is exactly zero.
During optimization, the gradient for this component is not explicitly stopped. However, the gradient of exp at negative infinity is exp(-inf) = 0, so the pre-activation value receives no update; even a finite update would leave it at negative infinity, and the activated output would still be zero.
Consequently, the third scale component remains zero throughout optimization, effectively zeroing out the scale along that axis and flattening each Gaussian.
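This behavior can be sketched in NumPy (a stand-in for the actual PyTorch parameters; the variable names here are illustrative, not from the codebase):

```python
import numpy as np

# Illustrative sketch: each Gaussian stores a log-scale that is activated with exp.
log_scale = np.array([0.1, 0.2, -np.inf])  # third axis initialized to log(0) = -inf
scale = np.exp(log_scale)                  # exp(-inf) = 0, so the third scale is exactly 0

# The derivative of exp(x) is exp(x) itself, so the gradient at -inf is also 0:
grad = np.exp(log_scale)
# A gradient step therefore changes nothing: -inf + lr * 0 = -inf, and exp(-inf) stays 0.
```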
The original distortion loss is defined as $\sum_i \sum_j w_i w_j |z_i - z_j|$. Computing it exactly requires a custom rasterizer that produces a distortion map. To simplify the implementation, we approximate the loss by keeping only the dominant weight: we assume $w_i w_j \approx 0$ whenever $j \neq \arg\max_j w_j$, since one of the two weights is then not the maximum. The distortion loss can then be written as:
$$
\begin{aligned}
\sum_i \sum_j w_i w_j |z_i - z_j| &\approx \sum_i w_i w_j |z_i - z_j|, \quad j = \arg\max_j w_j \\
&= w_j \sum_i w_i |z_i - z_j| \\
&= w_j \left| \sum_i w_i z_i - \sum_i w_i z_j \right|
\end{aligned}
$$
Using the definitions $\text{Depth} = \sum_i w_i z_i$ and $\text{Opacity} = \sum_i w_i$, the simplified distortion loss becomes:
$$ w_j |\text{Depth} - \text{Opacity} z_j| $$
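A minimal per-ray sketch of this simplified loss, assuming `w` holds the blending weights and `z` the depths of the Gaussians along one ray (the function and variable names are illustrative, not the repository's API):

```python
import numpy as np

def simplified_distortion_loss(w, z):
    """Simplified distortion loss for one ray: w_j * |Depth - Opacity * z_j|,
    where j indexes the dominant (maximum) blending weight."""
    j = np.argmax(w)          # dominant weight index
    depth = np.sum(w * z)     # Depth = sum_i w_i z_i
    opacity = np.sum(w)       # Opacity = sum_i w_i
    return w[j] * abs(depth - opacity * z[j])

# Toy ray: one dominant Gaussian and two faint ones at other depths
w = np.array([0.10, 0.70, 0.05])
z = np.array([2.0, 2.5, 4.0])
loss = simplified_distortion_loss(w, z)   # small, since most mass sits near z = 2.5
```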
# Install GauStudio for training and mesh extraction
%cd /content
!rm -r /content/gaustudio
!pip install -q plyfile torch torchvision tqdm opencv-python-headless omegaconf einops kiui scipy pycolmap==0.4.0 vdbfusion kornia trimesh
!git clone --recursive https://github.com/GAP-LAB-CUHK-SZ/gaustudio.git
%cd gaustudio/submodules/gaustudio-diff-gaussian-rasterization
!python setup.py install
%cd ../../
!python setup.py develop
# generate mask
python preprocess_mask.py --data <path to data>
# 2.5DGS training
python train.py -s <path to data> -m output/trained_result
# 2.5DGS training with normal prior
python train.py -s <path to data> -m output/trained_result --w_normal_prior
# 2.5DGS training with mask
python train.py -s <path to data> -m output/trained_result --w_mask #make sure that `masks` dir exists under the data folder
# naive 2.5DGS training without extra regularization
python train.py -s <path to data> -m output/trained_result --lambda_normal_consistency 0. --lambda_depth_distortion 0.
The results will be saved in `output/trained_result/point_cloud/iteration_{xxxx}/point_cloud.ply`.
Download the preprocessed DTU data provided by NeuS.
The data is organized as follows:
<model_id>
|-- cameras_xxx.npz    # camera parameters
|-- image
    |-- 000000.png     # target image for each view
    |-- 000001.png
    ...
|-- mask
    |-- 000000.png     # target mask for each view (for the unmasked setting, set all pixels to 255)
    |-- 000001.png
    ...
# 2.5DGS training
python train.py --dataset neus -s <path to DTU data>/<model_id> -m output/DTU-neus/<model_id>
# e.g.
python train.py --dataset neus -s ./data/DTU-neus/dtu_scan105 -m output/DTU-neus/dtu_scan105
# 2.5DGS training with mask
python train.py --dataset neus -s <path to DTU data>/<model_id> -m output/DTU-neus-w_mask/<model_id> --w_mask
# e.g.
python train.py --dataset neus -s ./data/DTU-neus/dtu_scan105 -m output/DTU-neus-w_mask/dtu_scan105 --w_mask
Download the original BlendedMVS data, which is in MVSNet input format. The data is organized as follows:
<model_id>
├── blended_images
│ ├── 00000000.jpg
│ ├── 00000000_masked.jpg
│ ├── 00000001.jpg
│ ├── 00000001_masked.jpg
│ └── ...
├── cams
│ ├── pair.txt
│ ├── 00000000_cam.txt
│ ├── 00000001_cam.txt
│ └── ...
└── rendered_depth_maps
├── 00000000.pfm
├── 00000001.pfm
└── ...
# 2.5DGS training
python train.py --dataset mvsnet -s <path to BlendedMVS data>/<model_id> -m output/BlendedMVS/<model_id>
# e.g.
python train.py --dataset mvsnet -s ./data/BlendedMVS/5a4a38dad38c8a075495b5d2 -m output/BlendedMVS/5a4a38dad38c8a075495b5d2
Download the original MobileBrick data. The data is organized as follows:
SEQUENCE_NAME
├── arkit_depth (the confidence and depth maps provided by ARKit)
| ├── 000000_conf.png
| ├── 000000.png
| ├── ...
├── gt_depth (The high-resolution depth maps projected from the aligned GT shape)
| ├── 000000.png
| ├── ...
├── image (the RGB images)
| ├── 000000.jpg
| ├── ...
├── mask (object foreground mask projected from the aligned GT shape)
| ├── 000000.png
| ├── ...
├── intrinsic (3x3 intrinsic matrix of each image)
| ├── 000000.txt
| ├── ...
├── pose (4x4 transformation matrix from camera to world of each image)
| ├── 000000.txt
| ├── ...
├── mesh
| ├── gt_mesh.ply
├── visibility_mask.npy (the visibility mask to be used for evaluation)
├── cameras.npz (processed camera poses using the format of NeuS)
# 2.5DGS training
python train.py --dataset mobilebrick -s <path to MobileBrick data>/<model_id> -m output/MobileBrick/<model_id>
# e.g.
python train.py --dataset mobilebrick -s ./data/MobileBrick/test/aston -m output/MobileBrick/aston
# 2.5DGS training with mask
python train.py --dataset mobilebrick -s <path to MobileBrick data>/<model_id> -m output/MobileBrick-w_mask/<model_id> --w_mask
# e.g.
python train.py --dataset mobilebrick -s ./data/MobileBrick/test/aston -m output/MobileBrick-w_mask/aston --w_mask
gs-extract-mesh -m output/trained_result -o output/trained_result
You may be interested in checking out the following repositories related to 2D Gaussian splatting and surfel representations:
If you found this library useful for your research, please consider citing:
@article{ye2024gaustudio,
  title={GauStudio: A Modular Framework for 3D Gaussian Splatting and Beyond},
  author={Ye, Chongjie and Nie, Yinyu and Chang, Jiahao and Chen, Yuantao and Zhi, Yihao and Han, Xiaoguang},
  journal={arXiv preprint arXiv:2403.19632},
  year={2024}
}
@article{huang20242d,
  title={2D Gaussian Splatting for Geometrically Accurate Radiance Fields},
  author={Huang, Binbin and Yu, Zehao and Chen, Anpei and Geiger, Andreas and Gao, Shenghua},
  journal={arXiv preprint arXiv:2403.17888},
  year={2024}
}
This project is licensed under the Gaussian-Splatting License - see the LICENSE file for details.