ljleb / sd-webui-neutral-prompt

Collision-free AND keywords for a1111 webui!
MIT License

non-top-level CompositePrompt with conciliation=None is not directly influenced by conciliated children #59

Closed by wbclark 7 months ago

wbclark commented 7 months ago

Hey @ljleb, thanks for the updates!

Trying out the new code, I found another issue. Based on my reading of the code, the parsing looks correct, but I think the issue is in cfg_denoiser_hijack.py. Here are some examples demonstrating what I think is happening, starting with one that works correctly:

  1. (Working correctly)

fire AND_PERP [ ice AND_PERP outer space AND_PERP high-definition photography ]

this is now parsed as

CompositePrompt(weight=1.0, conciliation=None, children=[
  LeafPrompt(weight=1.0, conciliation=None, prompt='fire '),
  CompositePrompt(weight=1.0, conciliation=<ConciliationStrategy.PERPENDICULAR: 'AND_PERP'>, children=[
    LeafPrompt(weight=1.0, conciliation=None, prompt=' ice '),
    LeafPrompt(weight=1.0, conciliation=<ConciliationStrategy.PERPENDICULAR: 'AND_PERP'>, prompt=' outer space '),
    LeafPrompt(weight=1.0, conciliation=<ConciliationStrategy.PERPENDICULAR: 'AND_PERP'>, prompt=' high-definition photography ')
  ])
])

and elements of all four prompts are clearly visible in the resulting image

  2. (Changing first AND_PERP to AND causes the issue)

fire AND [ ice AND_PERP outer space AND_PERP high-definition photography ]

this is now parsed as

CompositePrompt(weight=1.0, conciliation=None, children=[
  LeafPrompt(weight=1.0, conciliation=None, prompt='fire '),
  CompositePrompt(weight=1.0, conciliation=None, children=[
    LeafPrompt(weight=1.0, conciliation=None, prompt=' ice '),
    LeafPrompt(weight=1.0, conciliation=<ConciliationStrategy.PERPENDICULAR: 'AND_PERP'>, prompt=' outer space '),
    LeafPrompt(weight=1.0, conciliation=<ConciliationStrategy.PERPENDICULAR: 'AND_PERP'>, prompt=' high-definition photography ')
  ])
])

the resulting image contains no hint of outer space or high-definition photography, even if the weights for those prompts are made unreasonably high.

in fact, with all args the same (seed, etc), the resulting image is identical to just fire AND ice

  3. (Changing the way the original prompt is grouped also causes the issue)

[ fire AND_PERP ice AND_PERP outer space ] AND_PERP high-definition photography

this is now parsed as

CompositePrompt(weight=1.0, conciliation=None, children=[
  CompositePrompt(weight=1.0, conciliation=None, children=[
    LeafPrompt(weight=1.0, conciliation=None, prompt=' fire '),
    LeafPrompt(weight=1.0, conciliation=<ConciliationStrategy.PERPENDICULAR: 'AND_PERP'>, prompt=' ice '),
    LeafPrompt(weight=1.0, conciliation=<ConciliationStrategy.PERPENDICULAR: 'AND_PERP'>, prompt=' outer space ')
  ]),
  LeafPrompt(weight=1.0, conciliation=<ConciliationStrategy.PERPENDICULAR: 'AND_PERP'>, prompt=' high-definition photography')
])

the resulting image has characteristics of fire and high-definition photography but displays no characteristics of ice or outer space regardless of weights, etc.

keeping the same args, the weights and prompts for ice and outer space do influence the resulting image, but only indirectly via the calculation of the perpendicular cond for high-definition photography. to see how that plays out, try the following variations while keeping all other generation parameters the same:

  I. [ fire AND_PERP ice AND_PERP outer space :1000.0 ] AND_PERP high-definition photography
  II. [ fire AND_PERP ice AND_PERP outer space :-1000.0 ] AND_PERP high-definition photography (different from I.)
  III. [ fire AND_PERP ice AND_PERP outer space :1000.0 ] AND high-definition photography
  IV. [ fire AND_PERP ice AND_PERP outer space :-1000.0 ] AND high-definition photography (same(!) as III.)
  V. [ fire AND_SALT ice AND_SALT outer space ] AND high-definition photography (same(!) as III.)
  VI. fire AND high-definition photography (same(!) as III.)
  VII. [ fire AND high-definition photography ] (same as VI, as expected)
  VIII. [ fire AND_PERP high-definition photography ] (different from VII, but...)
  IX. fire (same(!) as VIII)
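To make "the perpendicular cond" concrete: a minimal sketch, assuming AND_PERP keeps only the part of one vector that is orthogonal to a reference vector (plain vector rejection). The function name and the use of Python lists in place of tensors are illustrative assumptions, not the extension's actual code.

```python
from typing import List, Sequence


def perpendicular_component(vector: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Return the part of `vector` orthogonal to `reference` (hypothetical helper)."""
    dot_vr = sum(v * r for v, r in zip(vector, reference))
    dot_rr = sum(r * r for r in reference)
    if dot_rr == 0:
        # Nothing to project onto, so the whole vector counts as perpendicular.
        return list(vector)
    scale = dot_vr / dot_rr
    # Subtract the projection of `vector` onto `reference`.
    return [v - scale * r for v, r in zip(vector, reference)]


# The same vector yields a different perpendicular part when the reference changes.
against_x = perpendicular_component([1.0, 1.0], [1.0, 0.0])
against_y = perpendicular_component([1.0, 1.0], [0.0, 1.0])
```

Under this assumption, anything that changes the reference also changes the perpendicular part, which would be consistent with the weights inside the brackets affecting how high-definition photography is computed in I. and II.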

As best as I can tell, the unexpected cases seem to occur exactly when there is a CompositePrompt with conciliation=None other than the top-level prompt. The conciliated children of that prompt do seem to affect the way that other prompts are conciliated against the composite, but they don't seem to be included when the composite itself is combined into its own parent.
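The condition described here can be checked mechanically. Below is a minimal sketch, assuming dataclasses that mirror the reprs printed earlier in this issue; the class definitions and `find_suspect_composites` are illustrative stand-ins, not the extension's actual code.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Union


class ConciliationStrategy(Enum):
    PERPENDICULAR = "AND_PERP"


@dataclass
class LeafPrompt:
    weight: float
    conciliation: Optional[ConciliationStrategy]
    prompt: str


@dataclass
class CompositePrompt:
    weight: float
    conciliation: Optional[ConciliationStrategy]
    children: List[Union["LeafPrompt", "CompositePrompt"]] = field(default_factory=list)


def find_suspect_composites(root: CompositePrompt) -> List[CompositePrompt]:
    """Non-root composites with conciliation=None that own conciliated children."""
    suspects = []
    stack = list(root.children)  # the root itself is never a suspect
    while stack:
        node = stack.pop()
        if isinstance(node, CompositePrompt):
            has_conciliated_child = any(c.conciliation is not None for c in node.children)
            if node.conciliation is None and has_conciliated_child:
                suspects.append(node)
            stack.extend(node.children)
    return suspects


PERP = ConciliationStrategy.PERPENDICULAR

# fire AND_PERP [ ice AND_PERP outer space ]  (reported as working)
working = CompositePrompt(1.0, None, [
    LeafPrompt(1.0, None, "fire"),
    CompositePrompt(1.0, PERP, [LeafPrompt(1.0, None, "ice"), LeafPrompt(1.0, PERP, "outer space")]),
])

# fire AND [ ice AND_PERP outer space ]  (reported as broken)
broken = CompositePrompt(1.0, None, [
    LeafPrompt(1.0, None, "fire"),
    CompositePrompt(1.0, None, [LeafPrompt(1.0, None, "ice"), LeafPrompt(1.0, PERP, "outer space")]),
])
```

With these two trees, `find_suspect_composites(working)` finds nothing, while `find_suspect_composites(broken)` returns the inner bracketed group.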

wbclark commented 7 months ago

The issue seems to be with conciliated children in the non-conciliated, non-top-level composite grouping only.

fire AND_PERP ice works, and so does [ fire AND ice ], but [ fire AND_PERP ice ] does not

fire AND_PERP [ ice AND_PERP jungle ] works but in [ fire AND_PERP ice ] AND_PERP jungle, AND_PERP ice does change how AND_PERP jungle is computed but doesn't actually affect fire. If you change AND_PERP jungle to just AND jungle then the partial effects of AND_PERP ice disappear completely.

I think I understand the issue well enough now to consistently predict which prompts will cause it. I'm almost certain the issue is in one of the Visitor classes; I'll try to figure out which.

ljleb commented 7 months ago

Appreciate your very detailed information, this is very useful.

I found a bug related to batch_size, but I haven't taken the time to commit a fix yet. I believe it is unrelated to what you're describing above.

I tried to double check the implementation, but it's possible I made a mistake when simplifying the visitor code. If you find a fix and find any time for this, I'll be happy to merge your PR. Otherwise I'll seek a fix when I find the time.

ljleb commented 7 months ago

I found the issue. This is caused by get_webui_denoised. The problem is that it only uses the leaf AND prompts to compute its result, which is wrong for what it's supposed to do. We can't simplify the problem the way I did. Instead, we have to fully compute each direct child of the root composite prompt that has conciliation=None and ignore the other branches. Then we call the original function with these fully computed denoised tensors, so that if anyone else patches the same function as we did, we still go through their code.
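The approach described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual patch: `Leaf`, `Composite`, `evaluate`, `denoised_for_webui` and `weighted_sum` are hypothetical names, Python lists stand in for denoised tensors, and AND_PERP is modeled as plain vector rejection against the sum of the non-conciliated siblings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Union

Vector = List[float]


@dataclass
class Leaf:
    prompt: str
    weight: float = 1.0
    conciliation: Optional[str] = None


@dataclass
class Composite:
    children: List[Union["Leaf", "Composite"]] = field(default_factory=list)
    weight: float = 1.0
    conciliation: Optional[str] = None


def _perp(vector: Vector, reference: Vector) -> Vector:
    """Part of `vector` orthogonal to `reference` (assumed AND_PERP behavior)."""
    dot_rr = sum(r * r for r in reference)
    if dot_rr == 0:
        return list(vector)
    scale = sum(v * r for v, r in zip(vector, reference)) / dot_rr
    return [v - scale * r for v, r in zip(vector, reference)]


def evaluate(node: Union[Leaf, Composite], leaf_deltas: Dict[str, Vector]) -> Vector:
    """Fully compute a node, including its conciliated descendants."""
    if isinstance(node, Leaf):
        return list(leaf_deltas[node.prompt])
    size = len(next(iter(leaf_deltas.values())))
    base = [0.0] * size
    for child in node.children:  # plain AND children form the base
        if child.conciliation is None:
            value = evaluate(child, leaf_deltas)
            base = [b + child.weight * v for b, v in zip(base, value)]
    result = list(base)
    for child in node.children:  # conciliated children are folded in afterwards
        if child.conciliation == "AND_PERP":
            perp = _perp(evaluate(child, leaf_deltas), base)
            result = [r + child.weight * p for r, p in zip(result, perp)]
    return result


def denoised_for_webui(
    root: Composite,
    leaf_deltas: Dict[str, Vector],
    original_combine: Callable[[List[Vector], List[float]], Vector],
) -> Vector:
    """Evaluate each conciliation=None child of the root, then defer to the original function."""
    plain = [c for c in root.children if c.conciliation is None]
    computed = [evaluate(c, leaf_deltas) for c in plain]
    return original_combine(computed, [c.weight for c in plain])


def weighted_sum(tensors: List[Vector], weights: List[float]) -> Vector:
    """Stand-in for the original (possibly patched) webui combine function."""
    total = [0.0] * len(tensors[0])
    for tensor, weight in zip(tensors, weights):
        total = [a + weight * x for a, x in zip(total, tensor)]
    return total


# fire AND [ ice AND_PERP outer space ]
root = Composite([Leaf("fire"), Composite([Leaf("ice"), Leaf("outer space", conciliation="AND_PERP")])])
deltas = {"fire": [1.0, 0.0, 0.0], "ice": [0.0, 1.0, 0.0], "outer space": [0.0, 1.0, 1.0]}
combined = denoised_for_webui(root, deltas, weighted_sum)
```

In this toy setup `combined` has a non-zero third component, which only "outer space" contributes; a version that looked at the leaf AND prompts alone (fire and ice) would leave it at zero, matching the symptom reported in this issue.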

ljleb commented 7 months ago

Thanks for your contribution, the issue should be resolved now.

Note that there are currently slight precision errors introduced by the delta-space implementation. If you remove the extension, you should get just slightly better images on normal prompts.

Unless it is necessary, the code shouldn't rely on diff space. Will fix later.