damian0815 / compel

A prompting enhancement library for transformers-type text embedding systems
MIT License
525 stars 47 forks

Inconsistent results compared with A1111 WebUI #16

Open duongna21 opened 1 year ago

duongna21 commented 1 year ago

Hi @damian0815. Thank you for your great work here!

Could you please take a look at my comment describing the weird results I obtained using diffusers + compel in case of long weighted prompts?

Thank you!

damian0815 commented 1 year ago

your porny prompt will work better if you convert the weighting syntax to work with compel - check the link on the readme.

duongna21 commented 1 year ago

@damian0815 Thank you for the reply.

  1. The "porny"ness is in negative prompt as you can see. I deeply apologize for not reading the negative prompt carefully when I took it from civit. I just went through the (positive) prompt and found it perfectly polite. The output image is also not pornographic at all.
  2. I did convert the prompts to compel syntax as described in the comment but the result looks bad.

damian0815 commented 1 year ago

it's not expected that auto111 weighted prompts will "just work" in invoke - compel is written without reference to auto's weighting code. i don't know exactly how the auto weighting works, but i do know it works differently in at least one particular aspect. so even if you convert all the (()) to ++ it is still likely to look different.
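For readers who still want to try the "(()) to ++" conversion mentioned here, a minimal sketch follows. It assumes that each pair of A1111 parentheses and each Compel `+` both scale attention by roughly 1.1, so nesting depth maps to the number of `+` signs. The function name `parens_to_plus` is hypothetical, and, as noted above, matching syntax does not guarantee matching images.

```python
import re

# Assumption: one pair of A1111 parentheses ~ one Compel "+" (about 1.1x each).
# Text containing ":" or brackets is left alone (numeric weights, [..] syntax).
_NESTED = re.compile(r"(\(+)([^():\[\]]+)(\)+)")


def parens_to_plus(prompt: str) -> str:
    """Rewrite A1111-style ((text)) emphasis as Compel-style (text)++."""
    def repl(match: re.Match) -> str:
        # Use the smaller of the opening/closing run lengths as the depth.
        depth = min(len(match.group(1)), len(match.group(3)))
        return f"({match.group(2)}){'+' * depth}"

    return _NESTED.sub(repl, prompt)


print(parens_to_plus("a ((red)) cat on a (blue) sofa"))
# a (red)++ cat on a (blue)+ sofa
```

Groups with an explicit weight such as (cyborg:1.1) pass through unchanged and need separate handling.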

burgalon commented 1 year ago

@damian0815 i think this is a more general problem we’re seeing where compel compiled prompts tend to have an over saturation artifact

damian0815 commented 1 year ago

i reopened the issue - can you demonstrate it with a simple prompt like a cat playing with a ball in the forest? it's hard to tell what's going on with the typical word-salad prompts from civitai or whatever..

damian0815 commented 1 year ago

actually now that i think about it the problem here is almost certainly that auto111 is doing karras scheduling and diffusers is not
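For context, "karras scheduling" usually refers to the noise-level spacing proposed by Karras et al. (2022). The sketch below computes that spacing in plain Python so the effect is visible without loading a model. The `sigma_min` and `sigma_max` defaults are illustrative assumptions, not values read from either code base, and `rho=7` is the value suggested in that paper. Recent diffusers releases expose a `use_karras_sigmas` option on some schedulers, but whether it reproduces the web UI exactly is not something this thread establishes.

```python
def karras_sigmas(n: int, sigma_min: float = 0.0292,
                  sigma_max: float = 14.6146, rho: float = 7.0) -> list[float]:
    """Return n noise levels, highest first, spaced as in Karras et al. (2022).

    Requires n >= 2. The sigma bounds are assumed, illustrative defaults.
    """
    min_inv = sigma_min ** (1.0 / rho)
    max_inv = sigma_max ** (1.0 / rho)
    return [
        (max_inv + i / (n - 1) * (min_inv - max_inv)) ** rho
        for i in range(n)
    ]


sigmas = karras_sigmas(10)
# Most of the steps land at low noise levels rather than being evenly spread.
print([round(s, 3) for s in sigmas])
```

Because the step spacing differs, two back-ends using the same sampler name, seed, and step count can still denoise along different paths.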

BEpresent commented 1 year ago

Also observing a difference in results between auto1111 and compel prompts.

Here is a prompt example taken from civitai with the Deliberate model; the sampler is Euler a in all cases, with the same seed and settings. Findings below. It seems that either a complex syntax like ([tail | detailed wire]:1.3) causes problems, or just the colon syntax like (deformed, distorted, disfigured:1.3)? I would say removing all parentheses makes the results look much better than using the auto1111 syntax as is. Is there a way to parse or switch between the different syntax methods?

positive prompt: a cute kitten made out of metal, (cyborg:1.1), ([tail | detailed wire]:1.3), (intricate details), hdr, (intricate details, hyperdetailed:1.2), cinematic shot, vignette, centered

negative prompt: (deformed, distorted, disfigured:1.3), poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, (mutated hands and fingers:1.4), disconnected limbs, mutation, mutated, ugly, disgusting, blurry, amputation, flowers, human, man, woman

Auto1111 auto1111_cat

Same prompt as above with Diffusers + compel

output_test_Parentheses

Prompt as above with Diffusers + compel with all parentheses and weightings removed:

a cute kitten made out of metal, cyborg, tail detailed wire, intricate details, hdr, intricate details, hyperdetailed, cinematic shot, vignette, centered

deformed, distorted, disfigured, poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, mutated hands and fingers, disconnected limbs, mutation, mutated, ugly, disgusting, blurry, amputation, flowers, human, man, woman

output_test_no_parentheses

Prompt as above with Diffusers + compel with parentheses and + weightings in prompt:

a cute kitten made out of metal, cyborg+, tail++, detailed wire++, (intricate details)++, hdr, intricate details, hyperdetailed, cinematic shot, vignette, centered

(deformed, distorted, disfigured)+++, poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, (mutated hands and fingers)++++, disconnected limbs, mutation, mutated, ugly, disgusting, blurry, amputation, flowers, human, man, woman

output_test_parentheses_with_plus
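The hand conversion above maps the same A1111 weight to different numbers of `+` signs (1.3 becomes `++` in the positive prompt and `+++` in the negative one). A small worked calculation makes the effective weights explicit. It assumes each `+` multiplies the weight by 1.1; the helper names are hypothetical, and the rounding only makes sense for weights of 1.0 or more.

```python
import math

PLUS_FACTOR = 1.1  # assumed multiplier applied by each "+"


def plus_weight(count: int) -> float:
    """Effective weight of a term followed by `count` plus signs."""
    return PLUS_FACTOR ** count


def nearest_plus_count(weight: float) -> int:
    """Number of plus signs whose effective weight is closest (in log space)."""
    return round(math.log(weight) / math.log(PLUS_FACTOR))


for w in (1.1, 1.2, 1.3, 1.4):
    n = nearest_plus_count(w)
    print(f"{w} -> {'+' * n} (effective {plus_weight(n):.3f})")
# 1.1 -> + (effective 1.100)
# 1.2 -> ++ (effective 1.210)
# 1.3 -> +++ (effective 1.331)
# 1.4 -> ++++ (effective 1.464)
```

Under that assumption `++` corresponds to about 1.21, so the positive prompt above applies somewhat less emphasis to "tail" and "detailed wire" than the original 1.3.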

damian0815 commented 1 year ago

there isn’t going to be a 1:1 comparison possible when weighting is involved - if you need an interactive editor that uses Compel syntax i suggest InvokeAI.

if the question is about replicating unweighted prompts, check if the quality difference persists at ddim 40 steps, with karras scheduling disabled.

also please understand that it’s not really reasonable to expect exactly equivalent results when switching back-ends - tho again if you need a minimal change when switching from web ui to compel i’d suggest using Invoke, which runs a diffusion pipeline that is much closer to what you’ll get with compel+diffusers than auto11

BEpresent commented 1 year ago

Right, I would not expect an exact copy. I just noticed that a syntax such as (cyborg:1.1) does not seem to work. I think for many it would be enough if there were an automatic conversion so that something 'nice' comes out of it afterwards.

damian0815 commented 1 year ago

you are welcome to submit a conversion script and offer to maintain it. i'm very bad at regexes and i have absolutely zero desire to maintain something that can parse the horrendous mess that is auto11's syntax

BEpresent commented 1 year ago

i'm very bad at regexes

Same - turns out LLMs are pretty good at them. I assume parentheses ( ) would work, but a syntax such as (cyborg:1.1) wouldn't, so it would need to be turned into something like (cyborg)1.1. Brackets such as [] wouldn't work either. Below are some scripts I use to parse the A1111 syntax. I am not sure if this is the best way to do it, but it's what I'm currently using to convert into a Compel-compatible format.

Maybe someone finds this useful to include in their own custom scripts:

        import re

        # input_string holds the A1111-style prompt to convert.

        # Find and replace all instances of the colon format with the desired format
        converted_string = re.sub(r'\(([^:]+):([\d.]+)\)', r'(\1)\2', input_string)

        # Find and replace square brackets with round brackets and assign weight
        converted_string = re.sub(r'\[([^:\]]+)\]', r'(\1)0.909090909', converted_string)

        # Handle the general case of [x:number] and convert it to (x)0.9
        converted_string = re.sub(r'\[([^:]+):[\d.]+\]', r'(\1)0.9', converted_string)
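
As a usage sketch, the three substitutions can be wrapped in a function and run on the positive prompt posted earlier in this thread. The function name `a1111_to_compel` is hypothetical; the regexes are the ones from the snippet, unchanged.

```python
import re


def a1111_to_compel(prompt: str) -> str:
    """Apply the three substitutions from the snippet above, in order."""
    # (text:1.3) -> (text)1.3
    out = re.sub(r'\(([^:]+):([\d.]+)\)', r'(\1)\2', prompt)
    # [text] -> (text)0.909090909
    out = re.sub(r'\[([^:\]]+)\]', r'(\1)0.909090909', out)
    # [text:number] -> (text)0.9
    out = re.sub(r'\[([^:]+):[\d.]+\]', r'(\1)0.9', out)
    return out


prompt = (
    "a cute kitten made out of metal, (cyborg:1.1), "
    "([tail | detailed wire]:1.3), (intricate details), hdr, "
    "(intricate details, hyperdetailed:1.2), cinematic shot, vignette, centered"
)
print(a1111_to_compel(prompt))
```

With these regexes, ([tail | detailed wire]:1.3) comes out as ((tail | detailed wire)0.909090909)1.3, so the bracketed group is treated as plain de-emphasis. If the brackets were meant as A1111 alternation, that meaning is lost, which is worth checking by hand before relying on the converted prompt.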

damian0815 commented 1 year ago

here's a solution somebody posted on the invokeai discord: https://sd.reashetyr.software/prompt-converter

WudiJoey commented 1 year ago

Hi guys, I am struggling with the same problem; even worse, I get low-resolution images. huggingface/diffusers#2431 (comment)

The situation I encountered is similar to yours. I found that it is caused by the weights in the prompt: A1111 supports larger weights, while a weight in compel should not exceed 1.6.
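
One way to act on that observation is to cap numeric weights after converting a prompt to Compel's (text)1.3 form. The sketch below does only that. The 1.6 ceiling is the figure reported in this comment, not a documented Compel limit, and the function name `clamp_weights` is hypothetical.

```python
import re

MAX_WEIGHT = 1.6  # threshold reported in this thread; treat it as an assumption

# A closing parenthesis followed directly by a number, as in "(text)1.8".
_WEIGHT = re.compile(r"\)(\d+(?:\.\d+)?)")


def clamp_weights(prompt: str, max_weight: float = MAX_WEIGHT) -> str:
    """Lower any (text)N weight above max_weight; leave other weights untouched."""
    def repl(match: re.Match) -> str:
        if float(match.group(1)) > max_weight:
            return f"){max_weight:g}"
        return match.group(0)

    return _WEIGHT.sub(repl, prompt)


print(clamp_weights("(mutated hands and fingers)1.8, (cyborg)1.1"))
# (mutated hands and fingers)1.6, (cyborg)1.1
```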

WudiJoey commented 1 year ago

Hello, I have received your message and will reply as soon as possible.