klimaleksus / stable-diffusion-webui-embedding-merge

Extension for AUTOMATIC1111/stable-diffusion-webui for creating and merging Textual Inversion embeddings at runtime from string literals.
The Unlicense

Inline em fails in XYZ #10

Closed miasik closed 10 months ago

miasik commented 10 months ago

I get different pictures for the same settings when using inline EM with an XYZ plot.

How to reproduce:
Write a prompt with inline EM. My example is: <'Tzuyu' + 'Son Ye Jin' + 'Bae Suzy'>
Run an XYZ plot with any parameter that has no effect. I used "hires upscaler" with hires.fix disabled.

What's expected: all the pictures are the same.
What I get now: only the first image looks like the image generated without XYZ. All the other images look the same as each other but differ from the first one.

aleksusklim commented 10 months ago

Confirmed!

This is a side effect of canceling the processing when my mark in Generate Info is found… I think I should always reprocess everything, while ignoring my own temporary embeddings in the text.

I wonder why the images are different anyway. They should be identical, and I don't yet understand what is actually being generated. It is not "as if EM was disabled"! Hopefully I can fix this.

aleksusklim commented 10 months ago

Wait, completely disabling the Embedding Merge extension does indeed give the same results as in the X/Y/Z plot. But when I chose the {'text'} (or <'text'>) syntax, I was sure that {' and { ' (or <' and < ') tokenize identically!

Turns out <' is 27, 262 while < ' is 283, 262, and that makes a huge difference for the final picture! Now it seems impossible to type a literal <' when EM is active, because I thought you could just put a space there and it would simply disable EM without changing how the prompt is parsed… (In reality, although the space is not a token by itself, it is harmless only at the detection boundaries of real TI embeddings.)
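To illustrate why the space matters, here is a minimal sketch that reproduces the two id pairs quoted above. It rests on an assumption that is not stated in this thread: that the tokenizer's base vocabulary lists the printable ASCII characters starting at "!" (ids 0 and up), followed by the same characters in their end-of-word form (ids 256 and up). The helper name clip_base_ids is hypothetical, and the sketch ignores BPE merges entirely, so it is only meaningful for short punctuation pieces like these.

```python
def clip_base_ids(text):
    """Approximate token ids for printable-ASCII text, ignoring BPE merges.

    Assumption (not from the thread): a character in the middle of a word
    has id ord(ch) - 33, and the same character at the end of a word has
    that id plus 256.
    """
    ids = []
    for word in text.split():  # whitespace only separates words, it is not a token
        for i, ch in enumerate(word):
            if not 33 <= ord(ch) <= 126:
                raise ValueError("sketch only covers printable ASCII")
            base = ord(ch) - 33
            is_last = i == len(word) - 1
            ids.append(base + 256 if is_last else base)
    return ids


print(clip_base_ids("<'"))   # [27, 262]
print(clip_base_ids("< '"))  # [283, 262]
```

Under this assumption, the space never becomes a token, but it turns < into an end-of-word token (283 instead of 27), which is enough to change the conditioning.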

I will assume that nobody needs an exact literal <', and also that <'EM_x'> is not meant to be used and thus won't "work": e.g. currently <'EM_1'> gives the text EM_1 in place of itself; to fix the main bug I would have to make it give <'EM_1'> (a literal, unprocessed copy).

miasik commented 10 months ago

IDK if it helps, but https://github.com/hako-mikan/sd-webui-cd-tuner allows using <> in prompts as its control string, and it survives hires.fix. Example: <cdt:d1=6;d2=4;hrs=1;hd1=6;hd2=8;st1=6;st2=1>

aleksusklim commented 10 months ago

Nothing starting with <term: conflicts with EM, because my code specifically looks for <' or {'. But if some other extension requires bare angle brackets (triggering its processing on <), then it might clash.

For the time being, I will test how to reliably catch my own fake embedding names, so that the EM function can be applied to its own output and "do nothing more" with it (not changing the text and not creating more embeddings).
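The idea of making EM a no-op on its own output could be sketched roughly as follows. This is not the extension's actual code: the regular expressions, the function name apply_em, and the registry list are all hypothetical, and the expression syntax is heavily simplified (only the <'...'> form is recognized).

```python
import re

# Hypothetical, simplified patterns; the real extension parses much more.
EM_EXPR = re.compile(r"<'(.*?)'>")  # an inline EM expression
EM_NAME = re.compile(r"EM_\d+")     # a name EM generated earlier


def apply_em(prompt, registry):
    """Replace each EM expression with a generated mark, idempotently."""
    def repl(match):
        body = match.group(1)
        if EM_NAME.fullmatch(body):
            # Already our own output: return a literal copy, create nothing.
            return match.group(0)
        registry.append(body)  # stand-in for building a temporary embedding
        return "<'EM_%d'>" % len(registry)

    return EM_EXPR.sub(repl, prompt)


registry = []
once = apply_em("photo, <'low' + 'bad'>", registry)
twice = apply_em(once, registry)
print(once)           # photo, <'EM_1'>
print(twice == once)  # True
```

The second call leaves both the text and the registry untouched, which is the property needed when the same prompt is processed again for later images.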

aleksusklim commented 10 months ago

Should be fixed now!

To replicate wrong images (that were in XYZ after the first one) without disabling Embedding Merge, you may replace <' parts with <'',27,262>

miasik commented 10 months ago

To replicate wrong images (that were in XYZ after the first one) without disabling Embedding Merge, you may replace <' parts with <'',27,262>

Sorry, hard to get. Let's take negative: <'low' + 'bad'> <'quality' + 'resolution'> How my final prompt should look like?

aleksusklim commented 10 months ago

How my final prompt should look like?

Nothing changes, everything is working as-is.

I just explained how to replicate the wrong ones that you saw in your X/Y/Z plot previously. In your case, for <'Tzuyu' + 'Son Ye Jin' + 'Bae Suzy'> it would be <'',27,262>Tzuyu' + 'Son Ye Jin' + 'Bae Suzy'>, but there is not much benefit, because it simulates EM being disabled completely.
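For anyone who wants to reproduce the old behavior on an arbitrary prompt, the substitution described above is a plain string replacement. A minimal sketch, with a hypothetical helper name:

```python
def simulate_em_disabled(prompt):
    """Swap every <' opener for the token-id form <'',27,262>."""
    return prompt.replace("<'", "<'',27,262>")


print(simulate_em_disabled("<'Tzuyu' + 'Son Ye Jin' + 'Bae Suzy'>"))
# <'',27,262>Tzuyu' + 'Son Ye Jin' + 'Bae Suzy'>
```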