cvlab-columbia / zero123

Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)
https://zero123.cs.columbia.edu/
MIT License

CFG scale (guidance scale) for generation #18

Closed CiaoHe closed 1 year ago

CiaoHe commented 1 year ago

Hi, when playing with the live demo, I found that increasing the default CFG (guidance) scale, e.g. to 10+, improves the quality of the generated samples (less variance and sharper).

Just curious about this

ruoshiliu commented 1 year ago

Yes, that’s expected behavior when tuning the classifier-free guidance scale of a diffusion model. A higher CFG scale produces less diversity and higher consistency with the input observation, which is not necessarily a good thing, depending on the application. Single-view novel view synthesis is a severely under-constrained task, similar to text-to-image generation, so a diffusion model, as a probabilistic image-to-image translation model, is a particularly apt architecture for it.
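
For readers unfamiliar with how the guidance scale enters sampling, here is a minimal sketch of the standard classifier-free guidance combination step. It is not taken from the zero123 code base: the function name `apply_cfg` and the toy three-element "noise predictions" are hypothetical, and a real sampler would apply the same formula to full tensors at every denoising step.

```python
# Minimal sketch of classifier-free guidance (CFG), using plain Python lists.
# Assumption: eps_uncond and eps_cond stand in for the model's noise
# predictions without and with the conditioning input, respectively.

def apply_cfg(eps_uncond, eps_cond, scale):
    """Extrapolate from the unconditional prediction toward the conditional one.

    scale = 0 returns the unconditional prediction, scale = 1 returns the
    conditional prediction, and scale > 1 pushes further in the conditional
    direction.
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy values, purely illustrative.
eps_uncond = [0.0, 1.0, -1.0]
eps_cond = [0.5, 1.0, -2.0]

for scale in (1.0, 3.0, 10.0):
    print(scale, apply_cfg(eps_uncond, eps_cond, scale))
```

As the scale grows, the difference between the conditional and unconditional predictions is amplified, which is one way to see why samples follow the input more closely and vary less between runs.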