Open ArielReplicate opened 1 year ago
Hi @ArielReplicate, the `scale` parameter essentially controls the fidelity of the generated image to the target prompt, i.e. a higher value of `scale` makes the translated image resemble the target prompt more closely. Higher values of `scale` are mostly necessary for translating real guidance images, where the DDIM-inverted noise is restrictive and challenging to deviate from. Such cases mostly occur for primitive and textureless guidance images (e.g. segmentation masks, silhouettes, etc.). Note that too high a value of `scale` might cause undesirable artifacts, such as over-saturated colors, so it should be balanced accordingly (we generally found `scale` ∈ [10, 15] to give a good tradeoff).
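For intuition, here is a minimal sketch of how a guidance scale is usually applied in classifier-free guidance. It assumes `scale` plays the standard role of the unconditional guidance scale, as the config comment suggests. The function name `guided_noise` and the toy values are hypothetical and not taken from the repository.

```python
def guided_noise(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: start from the unconditional noise
    prediction and extrapolate toward the prompt-conditioned one."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy stand-ins for the denoiser's two noise predictions at one step
# (assumed values, for illustration only).
eps_uncond = [0.0, 1.0]
eps_cond = [1.0, 1.0]

for scale in (1.0, 10.0, 15.0):
    print(scale, guided_noise(eps_uncond, eps_cond, scale))
```

With `scale = 1` the result equals the conditional prediction. Larger values push further along the direction from the unconditional to the conditional prediction, which is consistent with stronger prompt fidelity and, at the extreme, with artifacts such as over-saturation.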
For deviating from the guidance image content, you can also use the negative prompt parameters, which in a sense have the opposite effect from `scale`, as they indicate what the translated image should deviate from rather than what it should be faithful to. Note that the negative prompt can describe only the part of the guidance content that you wish to deviate from; it doesn't have to describe the guidance image as a whole.
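As a rough illustration of the "opposite effect", one common way to apply a negative prompt is to use its noise prediction as the baseline that guidance pushes away from. The sketch below assumes that convention. The blend weight `alpha`, the function name, and the toy values are hypothetical, and the repository's actual implementation may differ.

```python
def guided_noise_with_negative(eps_null, eps_neg, eps_cond, scale, alpha=1.0):
    """Guidance with a negative prompt (assumed convention).

    eps_null: prediction for the empty prompt
    eps_neg:  prediction for the negative prompt
    eps_cond: prediction for the target prompt
    alpha:    hypothetical weight; 0 ignores the negative prompt,
              1 uses it as the full baseline
    """
    # Blend the empty-prompt and negative-prompt predictions into one baseline.
    base = [alpha * n + (1.0 - alpha) * z for z, n in zip(eps_null, eps_neg)]
    # Extrapolate from that baseline toward the target-prompt prediction.
    return [b + scale * (c - b) for b, c in zip(base, eps_cond)]


# Toy values: the first component stands for content named by the negative prompt.
eps_null = [0.0, 0.0]
eps_neg = [1.0, 0.0]
eps_cond = [0.0, 1.0]

print(guided_noise_with_negative(eps_null, eps_neg, eps_cond, scale=2.0, alpha=0.0))
print(guided_noise_with_negative(eps_null, eps_neg, eps_cond, scale=2.0, alpha=1.0))
```

In this toy setup, enabling the negative prompt (`alpha = 1`) drives the first component below zero, i.e. away from what the negative prompt describes, while the second component still moves toward the target prompt.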
Hi,
I'm trying to understand how the `scale` parameter affects the translation output. The only information I found was in the config file: "unconditional guidance scale. Note that a higher value encourages deviation from the source image"
Would you mind explaining how this parameter affects the translation and how it should be combined with other structure-preserving control parameters like `feature_injection_threshold` and the negative prompt parameters?