tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Apache License 2.0
5.08k stars 331 forks

Added missing guidance_scale passing to StableDiffusionXL from IPAdapterFaceIDPlusXL pipeline #270

Closed kovalexal closed 8 months ago

kovalexal commented 8 months ago

Hi @xiaohu2015!

I noticed that the guidance_scale parameter was not being passed from IPAdapterFaceIDPlusXL to StableDiffusionXLPipeline, so I added it to these pipelines.
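To illustrate the kind of bug described here, below is a minimal sketch using hypothetical stand-in classes (FakePipeline, WrapperBefore, WrapperAfter). It is not the actual IP-Adapter or diffusers code, and the default values are only illustrative. It shows how a wrapper that accepts guidance_scale but does not forward it silently falls back to the inner pipeline's default.

```python
class FakePipeline:
    """Hypothetical stand-in for the inner pipeline; it only records its arguments."""

    def __call__(self, prompt, guidance_scale=5.0, **kwargs):
        # Illustrative default; a real pipeline would run the denoising loop here.
        return {"prompt": prompt, "guidance_scale": guidance_scale}


class WrapperBefore:
    """Accepts guidance_scale but never forwards it (the behaviour being fixed)."""

    def __init__(self, pipe):
        self.pipe = pipe

    def generate(self, prompt, guidance_scale=7.5, **kwargs):
        # guidance_scale is dropped here, so the pipeline default is used.
        return self.pipe(prompt, **kwargs)


class WrapperAfter(WrapperBefore):
    """Forwards guidance_scale explicitly to the inner pipeline."""

    def generate(self, prompt, guidance_scale=7.5, **kwargs):
        return self.pipe(prompt, guidance_scale=guidance_scale, **kwargs)


pipe = FakePipeline()
before = WrapperBefore(pipe).generate("a portrait", guidance_scale=3.0)
after = WrapperAfter(pipe).generate("a portrait", guidance_scale=3.0)

print(before["guidance_scale"])  # the caller's value was ignored
print(after["guidance_scale"])   # the caller's value reaches the pipeline
```

In the "before" case the caller's value has no effect on the output, which is why the problem is easy to miss.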

xiaohu2015 commented 8 months ago

thanks a lot

Arcitec commented 8 months ago

Interesting and great job discovering this. Any idea what effect this change has on the image results? I'm guessing it meant that it always used the default value for guidance scale. But I am not sure what the guidance scale actually does. :)

xiaohu2015 commented 8 months ago

@Arcitec the default should work, but you can also try other values

Arcitec commented 8 months ago

Thanks. Yeah I guess it's this Stable Diffusion parameter:

https://huggingface.co/docs/diffusers/v0.14.0/en/api/pipelines/stable_diffusion/self_attention_guidance

Default should indeed be fine and we won't see any major improvements with this fix. Good discovery though!

Edit: Looks like guidance_scale might be their term for the famous "CFG" (classifier-free guidance) prompt adherence parameter. If so, I get it now. :P

xiaohu2015 commented 8 months ago

Yes, it is an original parameter of Stable Diffusion itself, not of IP-Adapter.

kovalexal commented 8 months ago

@Arcitec, this parameter influences the whole generation output. It was proposed in Classifier-Free Diffusion Guidance as an approach to increase the quality of output generation.

Simply put, this parameter controls how far the generation is pushed away from the negative text prompt (or the empty prompt, if none is given) and toward the positive prompt.
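As a rough sketch of the idea, classifier-free guidance combines two noise predictions at each denoising step: one conditioned on the negative (or empty) prompt and one conditioned on the positive prompt. The function name apply_cfg and the toy numbers below are made up for illustration; real pipelines do this on tensors inside the denoising loop.

```python
def apply_cfg(noise_uncond, noise_cond, guidance_scale):
    """Combine the two noise predictions with classifier-free guidance.

    noise_uncond: prediction for the negative/empty prompt
    noise_cond:   prediction for the positive prompt
    """
    # Start at the unconditional prediction and move along the direction
    # (cond - uncond), scaled by guidance_scale.
    return [u + guidance_scale * (c - u) for u, c in zip(noise_uncond, noise_cond)]


uncond = [0.0, 1.0]  # toy values, not real model outputs
cond = [1.0, 3.0]

no_extra_guidance = apply_cfg(uncond, cond, 1.0)  # equals the conditional prediction
strong_guidance = apply_cfg(uncond, cond, 7.5)    # pushed further from uncond

print(no_extra_guidance)
print(strong_guidance)
```

With a scale of 1.0 the result is just the conditional prediction. Larger values extrapolate further away from the negative/empty-prompt prediction, which usually means stronger prompt adherence at the cost of diversity.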