Neural implicit fields are powerful for representing 3D scenes and generating high-quality novel views, but it remains challenging to use such implicit representations for creating a 3D human avatar with a specific identity and artistic style that can be easily animated. Our proposed method, AvatarCraft, addresses this challenge by using diffusion models to guide the learning of geometry and texture for a neural avatar based on a single text prompt. We carefully design the optimization of neural implicit fields using diffusion models, including a coarse-to-fine multi-bounding box training strategy, shape regularization, and diffusion-based constraints, to produce high-quality geometry and texture. Additionally, we make the human avatar animatable by deforming the neural implicit field with an explicit warping field that maps the target human mesh to a template human mesh, both represented using parametric human models. This simplifies the animation and reshaping of the generated avatar by controlling pose and shape parameters. Extensive experiments on various text descriptions show that AvatarCraft is effective and robust in creating human avatars and rendering novel views, poses, and shapes.
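As a rough illustration of the warping idea (not the paper's exact formulation), deforming the canonical field amounts to mapping each posed-space query point back to canonical space, for example via inverse linear blend skinning with SMPL joint transforms. Every name in the sketch below is hypothetical, and the nearest-vertex weight lookup is our simplifying assumption:

```
import torch

def warp_to_canonical(x_posed, verts_posed, skin_weights, joint_tfms):
    """Map posed-space query points back to the canonical field (hypothetical sketch).

    x_posed:      (N, 3) sample points in posed space
    verts_posed:  (V, 3) posed SMPL vertices
    skin_weights: (V, J) SMPL per-vertex skinning weights
    joint_tfms:   (J, 4, 4) canonical-to-posed joint transforms
    """
    # Borrow skinning weights from the nearest posed SMPL vertex.
    nearest = torch.cdist(x_posed, verts_posed).argmin(dim=1)   # (N,)
    w = skin_weights[nearest]                                   # (N, J)

    # Blend the joint transforms, then invert to go posed -> canonical.
    T = torch.einsum("nj,jab->nab", w, joint_tfms)              # (N, 4, 4)
    x_h = torch.cat([x_posed, torch.ones_like(x_posed[:, :1])], dim=-1)
    x_canon = torch.einsum("nab,nb->na", torch.inverse(T), x_h)[:, :3]
    return x_canon
```

Because the warp is driven entirely by the parametric model's pose and shape parameters, the same canonical avatar can be re-posed or reshaped without retraining.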
[**Update**] :fire: Jun 2023: Code for avatar creation and articulation is released. :fire: Jul 2023: Our paper has been accepted to ICCV 2023!

## Environment Setup

Use Conda to create a virtual environment and install the dependencies:

```
conda create -n avatar python=3.7 -y;
conda activate avatar;
conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=10.2 -c pytorch;
# For GPUs with CUDA version 11.x, please use:
# conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
conda install -c fvcore -c iopath -c conda-forge fvcore iopath;
conda install -c bottler nvidiacub;
conda install pytorch3d -c pytorch3d;
conda install -c conda-forge igl;
pip install opencv-python joblib open3d imageio==2.25.0 tensorboardX chumpy scikit-image ipython matplotlib einops trimesh pymcubes;
pip install diffusers==0.16.1 transformers==4.29.1;
mkdir data
mkdir ckpts
```

## Data Setup

[**Compulsory**] Register and download the [SMPL](https://smpl.is.tue.mpg.de/) model, and put it under the `./data` directory:

```
data/
|-- smplx/
|   |-- smpl_uv.obj
|-- smpl/
|   |-- SMPL_NEUTRAL.pkl
```

[**Compulsory**] Create an access token on [Huggingface](https://huggingface.co/settings/tokens) for accessing the pretrained diffusion model, and use it to log in with the following command:

```
huggingface-cli login
```

[**Compulsory**] Download our pretrained [bare SMPL ckpt](https://drive.google.com/file/d/1GRfc9fbiBLTqEP6dURaReyERT-Tzk127/view?usp=share_link) and put it under the `./ckpts` directory.

[**Optional**] If you would like to animate the generated avatar, you need a sequence of SMPL poses. In our project, we use the [AMASS](https://amass.is.tue.mpg.de/) dataset (SMPL+H) to generate the poses; specifically, we used the SFU subset in our paper and video. We cannot redistribute the dataset, but we provide a [script](utils/convert_amass.py) to convert the AMASS format to ours; you need to download and process the data yourself. Alternatively, you may use your own SMPL pose sequence.
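For reference, converting an AMASS (SMPL+H) sequence down to SMPL body poses roughly amounts to the sketch below. The output `.npz` layout here is an illustrative assumption; `utils/convert_amass.py` is the converter actually used by this repo:

```
import numpy as np

def amass_to_smpl(npz_path, out_path):
    seq = np.load(npz_path)
    poses = seq["poses"]       # (T, 156): root + 21 body + 30 hand joints, axis-angle
    trans = seq["trans"]       # (T, 3) global translation
    betas = seq["betas"][:10]  # SMPL keeps only the first 10 shape params

    # SMPL has 24 joints (72 axis-angle params): keep the root and the 21
    # shared body joints, and zero out SMPL's two hand joints.
    smpl_poses = np.zeros((poses.shape[0], 72), dtype=poses.dtype)
    smpl_poses[:, :66] = poses[:, :66]

    np.savez(out_path, poses=smpl_poses, trans=trans, betas=betas)
```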
## Avatar Creation

Use the following command to create an avatar from a text prompt. We tested our code on an A100 (80 GB) and an RTX 3090 (24 GB); if you encounter an out-of-memory error, please reduce the batch size. For the prompt, we suggest providing as detailed a description as possible; otherwise, you may not get a reasonable result due to the high variance of the SDS loss (sketched below).
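For intuition about that variance, here is a minimal score distillation sampling (SDS) sketch built on `diffusers`. The checkpoint name, timestep range, and guidance scale are illustrative assumptions, not the exact values used by `stylize.py`:

```
import torch
from diffusers import StableDiffusionPipeline

# Hypothetical checkpoint; stylize.py may load a different diffusion model.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
unet, scheduler = pipe.unet, pipe.scheduler
alphas = scheduler.alphas_cumprod.to("cuda")

def sds_grad(latents, cond_emb, uncond_emb, guidance=100.0):
    # A fresh timestep and noise sample are drawn every call: this is the
    # source of the high variance mentioned above.
    t = torch.randint(20, 980, (1,), device=latents.device)
    noise = torch.randn_like(latents)
    noisy = scheduler.add_noise(latents, noise, t)

    with torch.no_grad():
        # Classifier-free guidance: conditional and unconditional passes.
        pred_cond = unet(noisy, t, encoder_hidden_states=cond_emb).sample
        pred_uncond = unet(noisy, t, encoder_hidden_states=uncond_emb).sample
    pred = pred_uncond + guidance * (pred_cond - pred_uncond)

    # Standard SDS weighting; the gradient deliberately skips backprop
    # through the U-Net.
    w = (1.0 - alphas[t]).view(-1, 1, 1, 1)
    return w * (pred - noise)

# Usage sketch: push the gradient into the rendered-image latents, e.g.
# latents.backward(gradient=sds_grad(latents, cond_emb, uncond_emb))
```

A detailed prompt narrows what the diffusion model pulls the avatar toward at each noisy step, which is why vague prompts tend to produce inconsistent results.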
Note: the first time you run the command, it will take a while to compile the CUDA operators; do not kill the process.

```
python stylize.py --weights_path "ckpts/bare_smpl.pth.tar" --tgt_text "Hulk, photorealistic style" --exp_name "hulk" --batch_size 4096
```

After creation, you can render the canonical avatar with the following command. If you don't want to train your own, you can also use our generated [avatars](https://drive.google.com/drive/folders/1t31_QK6mV9dJyCRc4VMLNJ6q0c3NQX7Q?usp=share_link):

```
python render_canonical.py --weights_path path/to/generated_avatar.pth.tar --exp_name "hulk" --render_h 256 --render_w 256
```

## Avatar Articulation