VAST-AI-Research / TriplaneGaussian

TriplaneGaussian: A new hybrid representation for single-view 3D reconstruction.
https://zouzx.github.io/TriplaneGaussian/
Apache License 2.0
# Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers

TGS reconstructs 3D from a single-view image in a few seconds, based on a hybrid Triplane-Gaussian 3D representation.



Official implementation of Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers.

## ⭐️ Key Features

## 🚩 News

## 💻 Examples

Please try our model online in the Gradio demo on Hugging Face Spaces.

https://github.com/VAST-AI-Research/TriplaneGaussian/assets/25632410/706da1b8-0b59-462a-b6e4-4a3316f9e909

### Results on Images Generated by Midjourney

https://github.com/VAST-AI-Research/TriplaneGaussian/assets/25632410/d27451e7-d298-4b6b-9dfe-f7927847167d

### Results on Captured Real-world Images

https://github.com/VAST-AI-Research/TriplaneGaussian/assets/25632410/1efe39d4-fcf1-4904-bf80-097796ca18e8

## 🏁 Quick Start

### Colab Demo

Run TGS in Google Colab: Open In Colab

### Installation

### Download the Pretrained Model

We offer a pretrained checkpoint on Hugging Face; download it and place it in the `checkpoints` folder.

```python
from huggingface_hub import hf_hub_download
MODEL_CKPT_PATH = hf_hub_download(
    repo_id="VAST-AI/TriplaneGaussian",
    local_dir="./checkpoints",
    filename="model_lvis_rel.ckpt",
    repo_type="model",
)
```

Please note that this model is trained only on the Objaverse-LVIS dataset (~45K 3D models). Models with more parameters (e.g., deeper layers, more feature channels) trained on larger datasets (e.g., the full Objaverse dataset) should achieve stronger performance, which we plan to explore in the future.

### Inference

Use the following command to reconstruct a 3DGS model from a single image. Set `data.image_list` to the list of your image paths.

```bash
python infer.py --config config.yaml data.image_list=[path/to/image1,] --image_preprocess --cam_dist ${cam_dist}
# e.g. python infer.py --config config.yaml data.image_list=[example_images/a_pikachu_with_smily_face.webp,] --image_preprocess
```

If you wish to remove the background from the input image, enable the `--image_preprocess` flag. Before doing so, please download the SAM checkpoint and place it in the `checkpoints` folder as well.

`--cam_dist` sets the camera distance parameter, i.e., the distance between the camera center and the scene center; it defaults to 1.9.
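To make the geometric meaning of `--cam_dist` concrete, here is a minimal sketch of how a camera pose at that distance could be built. This is an illustration under assumed conventions (OpenGL-style axes, y-up, camera looking at the origin), not the repo's actual camera code; the function name `lookat_pose` is hypothetical.

```python
# Illustrative sketch, NOT the repo's implementation: place a camera
# `cam_dist` units from the scene center (assumed at the origin) and
# build a camera-to-world look-at pose. Assumes y-up world axes and an
# OpenGL-style camera (looks down its local -z axis).
import numpy as np

def lookat_pose(cam_dist=1.9, elevation_deg=0.0, azimuth_deg=0.0):
    """Return a 4x4 camera-to-world matrix looking at the origin."""
    elev, azim = np.deg2rad([elevation_deg, azimuth_deg])
    # Camera center on a sphere of radius cam_dist around the origin.
    center = cam_dist * np.array([np.cos(elev) * np.sin(azim),
                                  np.sin(elev),
                                  np.cos(elev) * np.cos(azim)])
    forward = -center / np.linalg.norm(center)  # points at the origin
    # Degenerate at elevation = +/-90 deg, where forward || world-up.
    right = np.cross(forward, np.array([0.0, 1.0, 0.0]))
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)
    pose = np.eye(4)
    pose[:3, 0], pose[:3, 1], pose[:3, 2] = right, up, -forward
    pose[:3, 3] = center
    return pose
```

The camera center always sits exactly `cam_dist` from the scene center, which is the quantity the flag controls.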

Finally, the script will save a video (`.mp4`) and a 3DGS (`.ply`) file. The `.ply` format is consistent with graphdeco-inria/gaussian-splatting, making it compatible with other visualization tools such as gsplat.js.
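To illustrate what such a `.ply` file looks like, here is a stdlib-only sketch that writes and reads a simplified subset of the per-Gaussian attribute layout. This is not the repo's exporter: the real graphdeco-inria/gaussian-splatting format also stores normals and higher-order SH coefficients (`f_rest_0` ... `f_rest_44`), which are omitted here for brevity.

```python
# Illustrative sketch (simplified subset of the 3DGS PLY layout, not the
# repo's exporter): write and read a binary PLY with per-Gaussian
# attributes using only the standard library.
import struct

FIELDS = [
    "x", "y", "z",                        # Gaussian center
    "f_dc_0", "f_dc_1", "f_dc_2",        # DC spherical-harmonics color
    "opacity",                            # opacity (stored pre-activation)
    "scale_0", "scale_1", "scale_2",     # per-axis scale (stored as log)
    "rot_0", "rot_1", "rot_2", "rot_3",  # rotation quaternion
]

def write_gaussian_ply(path, gaussians):
    """gaussians: list of dicts mapping each name in FIELDS to a float."""
    header = ["ply", "format binary_little_endian 1.0",
              f"element vertex {len(gaussians)}"]
    header += [f"property float {name}" for name in FIELDS]
    header.append("end_header")
    with open(path, "wb") as f:
        f.write(("\n".join(header) + "\n").encode("ascii"))
        for g in gaussians:
            f.write(struct.pack("<" + "f" * len(FIELDS),
                                *(g[name] for name in FIELDS)))

def read_gaussian_ply(path):
    """Parse the header for property names, then unpack one record per vertex."""
    with open(path, "rb") as f:
        data = f.read()
    header, _, body = data.partition(b"end_header\n")
    lines = header.decode("ascii").splitlines()
    count = next(int(l.split()[-1]) for l in lines
                 if l.startswith("element vertex"))
    names = [l.split()[-1] for l in lines if l.startswith("property float")]
    fmt = "<" + "f" * len(names)
    stride = struct.calcsize(fmt)
    return [dict(zip(names, struct.unpack_from(fmt, body, i * stride)))
            for i in range(count)]
```

Because the attributes are plain named float properties in a standard PLY container, any viewer that understands the gaussian-splatting layout can load the output directly.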

### Local Gradio Demo

Our Gradio demo depends on a custom Gradio component for 3DGS rendering. Please clone this component first:

```bash
git clone https://github.com/dylanebert/gradio-splatting.git gradio_splatting
```

Then, you can launch the Gradio demo locally by:

```bash
python gradio_app.py
```

## 📝 Some Tips

## Acknowledgements

## Citation

If you find this work helpful, please consider citing our paper:

```bibtex
@article{zou2023triplane,
  title={Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers},
  author={Zou, Zi-Xin and Yu, Zhipeng and Guo, Yuan-Chen and Li, Yangguang and Liang, Ding and Cao, Yan-Pei and Zhang, Song-Hai},
  journal={arXiv preprint arXiv:2312.09147},
  year={2023}
}
```