ashawkey / stable-dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Apache License 2.0

Design of Image-to-3D pipeline #319

Open XinyangHan opened 1 year ago

XinyangHan commented 1 year ago

The image-to-3D demo results look surprisingly good! May I ask if there is detailed documentation that explains how it works?
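From skimming the code, my rough understanding is that the NeRF is rendered from random novel viewpoints and Zero123 acts as score distillation (SDS) guidance, conditioned on the reference image and the relative camera pose. Something like the sketch below (hypothetical names, just my reading of the idea, not the repo's exact API):

import torch

def zero123_sds_step(nerf, zero123, ref_embedding, rel_pose, optimizer):
    # 1. Render the current NeRF from a randomly sampled novel viewpoint.
    rendered = nerf.render(rel_pose)          # (1, 3, H, W) RGB in [0, 1]

    # 2. Perturb the rendering at a random diffusion timestep.
    t = torch.randint(20, 980, (1,), device=rendered.device)
    noise = torch.randn_like(rendered)
    noisy = zero123.add_noise(rendered, noise, t)

    # 3. Zero123 predicts the noise, conditioned on the reference image
    #    embedding and the relative camera transform; the diffusion
    #    model itself stays frozen.
    with torch.no_grad():
        noise_pred = zero123.predict_noise(noisy, t, ref_embedding, rel_pose)

    # 4. SDS: the gradient (noise_pred - noise), usually scaled by a
    #    timestep weight w(t), flows into the NeRF parameters only.
    grad = noise_pred - noise
    loss = (rendered * grad.detach()).sum()  # surrogate: d(loss)/d(rendered) = grad
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Is that roughly right, or is there more to it (e.g. the extra RGB/mask losses on the reference view)?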

jameskuma commented 1 year ago

I would also like to learn more details about Image-to-3D generation! I fine-tuned Zero123 on some images and saw an improvement in novel view synthesis. However, when I run the pipeline with the fine-tuned checkpoint, the results are quite similar to those from the original Zero123 checkpoint.
The commands I use are:

# Stage 1: train the coarse NeRF with Zero123 guidance
CUDA_VISIBLE_DEVICES=${GPU_ID} python main.py -O \
    --image path_to_img_rgba \
    --workspace logs/init \
    --iters 5000 \
    --batch_size 2 \
    --h 96 --w 96 \
    --guidance_scale 5 \
    --eval_interval 50 \
    --zero123_ckpt path_to_finetuned_ckpt \
    --zero123_config path_to_my_yaml
# Stage 2: convert to a DMTet mesh and refine from the stage-1 checkpoint
CUDA_VISIBLE_DEVICES=${GPU_ID} python main.py -O \
    --image path_to_img_rgba \
    --workspace logs/refine \
    --iters 10000 \
    --batch_size 2 \
    --guidance_scale 5 \
    --h 96 --w 96 \
    --eval_interval 50 \
    --dmtet --init_with path_to_init_ckpt
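One thing I still want to rule out is whether the fine-tuned weights are actually being loaded, since near-identical outputs can also mean the old checkpoint was silently picked up; I also notice my refine command doesn't pass --zero123_ckpt / --zero123_config, so the DMTet stage may be falling back to the default checkpoint. A quick sanity check I'm using (a sketch, assuming Lightning-style checkpoints with a state_dict key; the paths are placeholders):

import torch

# Load both checkpoints on CPU and compare the weights they share.
orig = torch.load("zero123-original.ckpt", map_location="cpu")["state_dict"]
tuned = torch.load("path_to_finetuned_ckpt", map_location="cpu")["state_dict"]

max_diff = max((orig[k].float() - tuned[k].float()).abs().max().item()
               for k in orig.keys() & tuned.keys())
print(f"max parameter difference: {max_diff:.6f}")

If the printed difference is ~0, the two files contain the same weights and the fine-tuning never made it into the checkpoint being used.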

This repo is really awesome! Could you please share more detailed documentation about 2D-to-3D generation with only Zero123 guidance?