ashawkey / stable-dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Apache License 2.0

Design of Image-to-3D pipeline #319

Open XinyangHan opened 1 year ago

XinyangHan commented 1 year ago

The image-to-3D demo results look surprisingly good! May I ask if there is detailed documentation that explains how it works?
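From skimming the code, my rough understanding is that the NeRF is rendered from random novel viewpoints and Zero123 acts as score distillation (SDS) guidance, conditioned on the reference image and the relative camera pose. Something like the sketch below (hypothetical names, just my reading of the idea, not the repo's exact API):

import torch

def zero123_sds_step(nerf, zero123, ref_embedding, rel_pose, optimizer):
    # 1. Render the current NeRF from a randomly sampled novel viewpoint.
    rendered = nerf.render(rel_pose)          # (1, 3, H, W) RGB in [0, 1]

    # 2. Perturb the rendering at a random diffusion timestep.
    t = torch.randint(20, 980, (1,), device=rendered.device)
    noise = torch.randn_like(rendered)
    noisy = zero123.add_noise(rendered, noise, t)

    # 3. Zero123 predicts the noise, conditioned on the reference image
    #    embedding and the relative camera transform; the diffusion
    #    model itself stays frozen.
    with torch.no_grad():
        noise_pred = zero123.predict_noise(noisy, t, ref_embedding, rel_pose)

    # 4. SDS: the gradient (noise_pred - noise), usually scaled by a
    #    timestep weight w(t), flows into the NeRF parameters only.
    grad = noise_pred - noise
    loss = (rendered * grad.detach()).sum()  # surrogate: d(loss)/d(rendered) = grad
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Is that roughly right, or is there more to it (e.g. the extra RGB/mask losses on the reference view)?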

jameskuma commented 1 year ago

I would also like to learn more details about Image-to-3D generation! I fine-tuned Zero123 on some images and saw an improvement in novel view synthesis. However, when I run the pipeline with the fine-tuned checkpoint, the results are quite similar to those from the original Zero123 checkpoint.
The commands I use are:

# Stage 1: train the coarse NeRF with Zero123 guidance
CUDA_VISIBLE_DEVICES=${GPU_ID} python main.py -O \
    --image path_to_img_rgba \
    --workspace logs/init \
    --iters 5000 \
    --batch_size 2 \
    --h 96 --w 96 \
    --guidance_scale 5 \
    --eval_interval 50 \
    --zero123_ckpt path_to_finetuned_ckpt \
    --zero123_config path_to_my_yaml
# Stage 2: convert to a DMTet mesh and refine from the stage-1 checkpoint
CUDA_VISIBLE_DEVICES=${GPU_ID} python main.py -O \
    --image path_to_img_rgba \
    --workspace logs/refine \
    --iters 10000 \
    --batch_size 2 \
    --guidance_scale 5 \
    --h 96 --w 96 \
    --eval_interval 50 \
    --dmtet --init_with path_to_init_ckpt
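One thing I still want to rule out is whether the fine-tuned weights are actually being loaded, since near-identical outputs can also mean the old checkpoint was silently picked up; I also notice my refine command doesn't pass --zero123_ckpt / --zero123_config, so the DMTet stage may be falling back to the default checkpoint. A quick sanity check I'm using (a sketch, assuming Lightning-style checkpoints with a state_dict key; the paths are placeholders):

import torch

# Load both checkpoints on CPU and compare the weights they share.
orig = torch.load("zero123-original.ckpt", map_location="cpu")["state_dict"]
tuned = torch.load("path_to_finetuned_ckpt", map_location="cpu")["state_dict"]

max_diff = max((orig[k].float() - tuned[k].float()).abs().max().item()
               for k in orig.keys() & tuned.keys())
print(f"max parameter difference: {max_diff:.6f}")

If the printed difference is ~0, the two files contain the same weights and the fine-tuning never made it into the checkpoint being used.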

This repo is really awesome! Could you please share more detailed documentation about 2D-to-3D generation with only Zero123 guidance?