theEricMa / ScaleDreamer

This is the official repository for ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation [ECCV2024]
Apache License 2.0

ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation

Zhiyuan Ma, Yuxiang Wei, Yabin Zhang, Xiangyu Zhu, Zhen Lei, Lei Zhang

[[Paper]](https://arxiv.org/pdf/2407.02040) | [[Project Page]](https://sites.google.com/view/scaledreamer-release/)

🔥 News

⚙ī¸ Dependencies and Installation

Follow threestudio to set up the conda environment, or use the instructions below.

- Create a virtual environment:

```sh
conda create -n scaledreamer python=3.10
conda activate scaledreamer
```

- Install PyTorch:

```sh
# Prefer using the latest version of CUDA and PyTorch
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=12.1 -c pytorch -c nvidia
```

- (Optional, Recommended) Install [xFormers](https://github.com/facebookresearch/xformers) for attention acceleration:

```sh
conda install xformers -c xformers
```

- (Optional, Recommended) Install ninja to speed up the compilation of CUDA extensions:

```sh
pip install ninja
```

- Install major dependencies:

```sh
pip install -r requirements.txt
```

- Install [iNGP](https://github.com/NVlabs/instant-ngp) and [NerfAcc](https://github.com/nerfstudio-project/nerfacc):

```sh
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
pip install git+https://github.com/KAIR-BAIR/nerfacc.git@v0.5.2
```

If you encounter errors while installing iNGP, check your gcc version. The commands below change the gcc version within your conda environment. Afterwards, return to the repository directory and install iNGP and NerfAcc ⬆️ again.

```sh
conda install -c conda-forge gxx=9.5.0
cd $CONDA_PREFIX/lib
ln -s /usr/lib/x86_64-linux-gnu/libcuda.so ./
cd
```
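After the installation steps above, a quick check of which packages Python can actually see may save time before the first training run. The snippet below is a minimal sketch, not part of the repository: it assumes the usual import names (`torch`, `xformers`, `nerfacc`, and `tinycudann` for the tiny-cuda-nn bindings) and only tests whether each package can be located, without importing it or touching the GPU.

```python
import importlib.util

# Import names assumed for the packages installed above. xformers is optional;
# nerfacc and tinycudann are compiled extensions, so they are the ones most
# likely to be missing after a failed build.
PACKAGES = ["torch", "xformers", "nerfacc", "tinycudann"]


def check_packages(names):
    """Return {name: True/False} depending on whether each package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(PACKAGES).items():
        print(f"{name:12s} {'found' if found else 'MISSING'}")
```

A package reported as found can still fail at import time (for example, because of a CUDA mismatch), so treat this as a first filter only.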
- Download 2D Diffusion Priors.
  - Save [SD-v2.1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) and [MVDream](https://mv-dream.github.io/) to the local directory `pretrained`.

```
python scripts/download_pretrained_models.py
```

🌈 Prompt-Specific 3D Generation

🚀 Prompt-Amortized 3D Generator Training

The following 3D generator architectures are available:

| Network | Description | File |
| --- | --- | --- |
| Hyper-iNGP | iNGP with text-conditioned linear layers, adapted from ATT3D | geometry, background |
| 3DConv-net | A StyleGAN generator that outputs voxels with 3D convolution, code adapted from CC3D | geometry, architecture |
| Triplane-Transformer | Transformer-based 3D generator with a triplane as the output structure, adapted from LRM | geometry, architecture |

The following corpus datasets are available:

| Corpus | Description | File |
| --- | --- | --- |
| MG15 | 15 text prompts from the Magic3D project page | json |
| DF415 | 415 text prompts from the DreamFusion project page | json |
| AT2520 | 2520 text prompts from the ATT3D experiments | json |
| DL17k | 17k text prompts from the Instant3D release | json |
| CP100k | 110k text prompts from the Cap3D dataset | json |

Run one of the following scripts to start training:

sh scripts/multi-prompt-benchmark/asd_sd_hyper_iNGP_MG15.sh
sh scripts/multi-prompt-benchmark/asd_sd_3dconv_net_DF415.sh
sh scripts/multi-prompt-benchmark/asd_sd_3dconv_net_AT2520.sh
sh scripts/multi-prompt-benchmark/asd_mv_triplane_transformer_DL17k.sh
sh scripts/multi-prompt-benchmark/asd_sd_3dconv_net_CP100k.sh

📷 Prompt-Amortized 3D Generator Evaluation

Create a directory to save the checkpoints:

mkdir pretrained/3d_checkpoints

The checkpoints of the ⬆️ experiments are available. Save the corresponding .pth file to pretrained/3d_checkpoints, then run the scripts below.

sh scripts/multi_prompts_benchmark_evaluation/asd_sd_hyper_iNGP_MG15.sh
sh scripts/multi_prompts_benchmark_evaluation/asd_sd_3dconv_net_DF415.sh
sh scripts/multi_prompts_benchmark_evaluation/asd_sd_3dconv_net_AT2520.sh
sh scripts/multi_prompts_benchmark_evaluation/asd_mv_triplane_transformer_DL17k.sh
sh scripts/multi_prompts_benchmark_evaluation/asd_sd_3dconv_net_CP100k.sh
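Before launching an evaluation script, it can be useful to confirm that the expected checkpoint file is actually in place. The helper below is a small sketch and not part of the repository; `list_checkpoints` is a hypothetical name, and the default path simply reuses the `pretrained/3d_checkpoints` directory created above.

```python
from pathlib import Path


def list_checkpoints(directory="pretrained/3d_checkpoints"):
    """Return the sorted names of all .pth files directly inside `directory`."""
    root = Path(directory)
    if not root.is_dir():
        # A missing directory is reported as "no checkpoints" rather than an error.
        return []
    return sorted(p.name for p in root.glob("*.pth"))


if __name__ == "__main__":
    names = list_checkpoints()
    print(names if names else "no .pth checkpoints found")
```

Run it from the repository root so that the relative default path resolves correctly.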

The rendered images and videos are saved in the outputs/<experiment_name>/save/<num_iter> directory. Compute the CLIP metrics with:

python evaluation/CLIP/evaluation_amortized.py --result_dir <video_dir>
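The exact metrics reported by the evaluation script are not spelled out here. As a rough intuition for how CLIP-based scores usually work, the sketch below assumes the score is a cosine similarity between image and text embeddings, averaged over rendered views. The function names and the toy 3-dimensional vectors are illustrative assumptions; real CLIP embeddings come from a CLIP encoder and are much larger.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_similarity(image_embeddings, text_embedding):
    """Average the similarity of every rendered view to the prompt embedding."""
    scores = [cosine_similarity(img, text_embedding) for img in image_embeddings]
    return sum(scores) / len(scores)


# Toy vectors standing in for real CLIP embeddings (assumption for illustration).
views = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
prompt = [1.0, 0.0, 0.0]
print(mean_similarity(views, prompt))  # 0.5
```

In this toy case the first view matches the prompt exactly and the second is orthogonal to it, so the average is 0.5.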

🕹ī¸ Create Your Own Modules

3D Generator

  1. Place the code in custom/amortized/models/geometry; see the other code in that directory for reference.
  2. Import your new module in custom/amortized/models/geometry/__init__.py so that it is registered.
  3. Create your own config file and enter your registered module name in the system.geometry_type argument; see the other config files in the configs/multi-prompt_benchmark directory for reference.
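The steps above rely on a name-based registry: a config refers to a module by a string, and that string only resolves if the file defining the module has been imported. The sketch below illustrates the pattern with a self-contained stand-in; `register`, `find`, `My3DGenerator`, and the name `my-3d-generator` are all hypothetical and should not be read as the repository's actual API.

```python
# Minimal stand-in for a name -> class registry; every name here is hypothetical.
_REGISTRY = {}


def register(name):
    """Class decorator that stores the decorated class under `name`."""
    def decorator(cls):
        _REGISTRY[name] = cls
        return cls
    return decorator


def find(name):
    """Look up a registered class; raises KeyError if it was never registered."""
    return _REGISTRY[name]


@register("my-3d-generator")  # hypothetical module name
class My3DGenerator:
    def __init__(self, hidden_dim=64):
        self.hidden_dim = hidden_dim


# What a config entry such as `system.geometry_type: my-3d-generator`
# would resolve to in this sketch.
generator_cls = find("my-3d-generator")
print(generator_cls.__name__)  # My3DGenerator
```

Because the decorator runs at import time, a module that is never imported never enters the registry, which is why the `__init__.py` step is needed.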

2D Diffusion Guidance

  1. Put your code in threestudio/models/guidance; see the other guidance modules in that directory for reference.
  2. Import your new module in threestudio/models/guidance/__init__.py so that it is registered.
  3. Create your own config file and enter your registered module name in the system.guidance_type argument; see the other config files in the configs/multi-prompt_benchmark directory for reference.

Text corpus

  1. Create a JSON file in the load directory that lists the training, validation, and test text prompts.
  2. Enter the name of this JSON file into the system.prompt_processor.prompt_library argument to set up the corpus; see the commands in the scripts directory for reference.
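A corpus file like the one described above could be generated with a few lines of Python. This is only a sketch: the key names `train`, `val`, and `test`, the helper `write_corpus`, and the file name `my_corpus.json` are assumptions, so compare the result against one of the existing JSON files in the load directory and adjust the keys to match before training.

```python
import json
import tempfile
from pathlib import Path


def write_corpus(path, train, val, test):
    """Write the three prompt lists to `path` as a single JSON object."""
    # Key names are assumed; check an existing corpus file for the real schema.
    corpus = {"train": list(train), "val": list(val), "test": list(test)}
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(corpus, indent=2, ensure_ascii=False), encoding="utf-8")
    return corpus


if __name__ == "__main__":
    # Demo in a temporary directory so nothing in the repository is overwritten.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "load" / "my_corpus.json"  # hypothetical file name
        write_corpus(out, ["a red apple"], ["a blue chair"], ["a wooden boat"])
        print(out.read_text(encoding="utf-8"))
```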

You can also add your modules for data, renderer, prompt_processor, etc.

📖 Citation

If you find this paper helpful, please cite:

@article{ma2024scaledreamer,
  title={ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation},
  author={Ma, Zhiyuan and Wei, Yuxiang and Zhang, Yabin and Zhu, Xiangyu and Lei, Zhen and Zhang, Lei},
  journal={arXiv preprint arXiv:2407.02040},
  year={2024}
}

🙏 Acknowledgement