SUDO-AI-3D / zero123plus

Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.

Training details #55

Closed: fradif96 closed this issue 6 months ago

fradif96 commented 6 months ago

Hello, thank you for the great work. I would kindly ask you for some training details. In particular:

1. How many GPUs, and of what type, did you use for training? How long did training take?
2. Which exact SD checkpoint was fine-tuned? The report says it is the SD 2 v-model (Sec. 2.5), but a link to the exact SD checkpoint from which fine-tuning was initialized would be helpful.

Thanks in advance

eliphatfs commented 6 months ago
  1. 24 A100s for 2 days for phase 1 and 16 A100s for 3.5 days for phase 2 training.
  2. I think there is only one model called SD 2, and it is a v-model: https://huggingface.co/stabilityai/stable-diffusion-2 (see the loading sketch below)
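
For reference, a minimal diffusers sketch of loading that checkpoint as a fine-tuning starting point; this is not Zero123++'s (unreleased) training code, just an illustration of the initialization:

```python
import torch
from diffusers import StableDiffusionPipeline

# A minimal sketch, assuming the standard diffusers layout of the
# checkpoint linked above; not the project's actual training code.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2", torch_dtype=torch.float16
)

# Sanity check that this is the v-prediction model mentioned above.
print(pipe.scheduler.config.prediction_type)  # expected: "v_prediction"

# Fine-tuning would then start from these weights, e.g. the UNet:
unet = pipe.unet
```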
netswift2905 commented 6 months ago

Really awesome. Do you plan to release the training code?

eliphatfs commented 6 months ago

We are planning to release ControlNet training code.

shinetzh commented 1 month ago
> 1. 24 A100s for 2 days for phase 1 and 16 A100s for 3.5 days for phase 2 training.
> 2. I think there is only one model called SD 2, and it is a v-model: https://huggingface.co/stabilityai/stable-diffusion-2

Great work! What batch size did you use for training, and how many epochs?

shinetzh commented 1 month ago

Also, the paper says the reference latent is scaled by a factor of 5, but I did not find that factor used in the inference code.
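
For concreteness, a hypothetical sketch of where such a scale is typically applied when encoding a reference image to a latent. The function name and the placement of the factor are assumptions for illustration, not the repository's actual inference code:

```python
import torch
from diffusers import AutoencoderKL

# Hypothetical sketch: one plausible placement of the paper's factor-of-5
# scaling of the reference latent. Names and placement are assumptions,
# not taken from the zero123plus inference code.
vae = AutoencoderKL.from_pretrained(
    "stabilityai/stable-diffusion-2", subfolder="vae"
)

@torch.no_grad()
def encode_reference(image: torch.Tensor) -> torch.Tensor:
    # image: (B, 3, H, W), values in [-1, 1]
    lat = vae.encode(image).latent_dist.sample()
    lat = lat * vae.config.scaling_factor  # standard SD latent scale (0.18215)
    return lat * 5.0  # paper's reported factor of 5 (assumed placement)
```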