AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

The same image can no longer be output under the same conditions. #2197

Closed Keigin closed 1 year ago

Keigin commented 1 year ago

I updated last night and this is the problem I encountered.

I copied the generation settings from PNG Info to txt2img and selected the same model, but I can no longer output the same image. I am very troubled because I do not know the cause; until now, exactly the same image was generated.

Has there been any change in how prompt weighting or the CFG scale is handled? If this is happening only to me, it may be a problem with my environment...

hentailord85ez commented 1 year ago

If your prompt is longer than 75 tokens then definitely yes, processing has changed.

Keigin commented 1 year ago

Thanks for the reply! Maybe the rules have been changed so that if my input exceeds 75 tokens, instead of the excess being ignored, the overall impact is reduced according to the number of characters over the limit?

hentailord85ez commented 1 year ago

No. Before, if your input exceeded 75 tokens, it didn't really do much. Now it actually has an effect on the image.
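To make the difference concrete, here is a minimal sketch of the two behaviours discussed above, assuming the simplest possible model: the old handling drops everything past 75 tokens, and the new handling splits the prompt into consecutive 75-token chunks so the tail still contributes. The function names and the placeholder tokens are hypothetical, and this is not the web UI's actual code.

```python
from typing import List

CHUNK_SIZE = 75  # the token limit mentioned in this thread


def truncate(tokens: List[str], limit: int = CHUNK_SIZE) -> List[str]:
    """Assumed old behaviour: tokens past the limit are simply dropped."""
    return tokens[:limit]


def chunk(tokens: List[str], size: int = CHUNK_SIZE) -> List[List[str]]:
    """Assumed new behaviour: split into consecutive chunks of `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


# Placeholder tokens; a real prompt would go through the CLIP tokenizer.
tokens = [f"tok{i}" for i in range(100)]

print(len(truncate(tokens)))            # 75
print([len(c) for c in chunk(tokens)])  # [75, 25]
```

Under this model, a prompt under 75 tokens gives the same single chunk either way, so only longer prompts would be expected to change for this reason.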

Keigin commented 1 year ago

Thank you! I'll check it out ASAP!

Proggle commented 1 year ago

This has been reported already; it happens even with prompts of fewer than 75 tokens.

https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/2005

Makes the PNG info thing kinda pointless since you can't regen the same image.

Keigin commented 1 year ago

[Three images attached]

I see that others have already reported this case. My apologies. But I am relieved that there are others who are the same.

I'll post my output image as well. The left is the new image and the right is the old image. They were output with exactly the same settings and the same model.

I hope the quality of the images is better with the new version, but in my case, there are some undesirable changes.

My memory is a bit fuzzy, but didn't "Stop At last layers of CLIP model" used to have a zero option? I thought I was using it with zero. I don't know if this has anything to do with it.

By the way, every picture was less than 75 tokens.

squishieuwu commented 1 year ago

I updated a few hours ago and also noticed the inability to regenerate images with the same settings (under 75 tokens). Does this have anything to do with the NAI sampler parity work that's being done?

output changed

Also, I'm a GitHub noob: is there a way to easily revert back a day or so? I didn't save a backup.

hentailord85ez commented 1 year ago

I'll try and see if I can reproduce. Edit: taking quite a while with my slow pc. I'm not noticing any changes. If someone has a prompt with a non-custom-merged model and has this problem, send it and I'll have a look. If you also remember when was the last update that you had before this issue, that would help too.

Keigin commented 1 year ago

Unfortunately, I was using a 50% mixture of the two models. As for the update time, I think it was from a version that was at least 3 or 4 days old, since I believe I ran [pull] before each startup.

Mirabilis1729 commented 1 year ago

@hentailord85ez

I think the issue lies with how the new code is reading the ( ) and [ ] esp when applied to negative prompts (maybe both). It is either disregarding them or not reading them at all. Whereas yesterday it was working great.

I was trying out the positive and negative prompts in the default model based on this advice : -

https://rentry.co/8vaaa

from one of the Waifu Booba guys (for science obv ¯_(ツ)_/¯ ) and it was yielding amazing results in terms of pretty flawless imagery without the usual SD issues: -

00365-2162382813-Scarlett Johansson portrait, ((large breast)),((toned abs)),(thick thigh) sensual post, smirk, forward facing, fantasy, elegant,-before-face-restoration

However after an update today, things are quite different and a lot of the time the generations seem to ignore some or all of the negatives

02557-2162382813-Scarlett Johansson portrait, ((large breast)),((toned abs)),(thick thigh) sensual post, smirk, forward facing, fantasy, elegant(1)

^same exact settings :|

Hope that helps. Data is embedded in the PNGs.

hentailord85ez commented 1 year ago

If you're using that exact negative prompt as is from the rentry, it's the other way around. The prompt changing is expected as it is > 75 tokens.

It wasn't accounting properly for it before yesterday and now it is. That's why my guess is that your negative prompt is now conflicting with itself, and you're putting too much emphasis on placebo tokens, which lessens the effect of everything as a whole.

Try reducing the bracket of everything by 1-2 layers in the negatives.
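For a sense of what removing "1-2 layers" of brackets does, here is a small sketch of the emphasis arithmetic, assuming the documented web UI convention that each `( )` pair multiplies attention by 1.1 and each `[ ]` pair divides it by 1.1. The helper name is hypothetical, and it only handles a single term wrapped in matching brackets, not a full prompt.

```python
from typing import Tuple


def emphasis_weight(term: str, factor: float = 1.1) -> Tuple[str, float]:
    """Strip matching outer brackets and return (bare term, attention weight)."""
    weight = 1.0
    while len(term) >= 2:
        if term[0] == "(" and term[-1] == ")":
            weight *= factor   # one layer of () raises attention
        elif term[0] == "[" and term[-1] == "]":
            weight /= factor   # one layer of [] lowers attention
        else:
            break
        term = term[1:-1]
    return term, weight


for raw in ["((large breast))", "(thick thigh)", "[hat]"]:
    text, weight = emphasis_weight(raw)
    print(text, round(weight, 3))
# large breast 1.21
# thick thigh 1.1
# hat 0.909
```

So going from three layers of parentheses to one takes a term from roughly 1.33 down to 1.1 under this assumption.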

Mirabilis1729 commented 1 year ago

TBF I was just cutting/pasting from the waifu guy and not fussing too much about the emphasis levels, just testing it out, but I was very impressed with the overall results in terms of the impact it had on reducing issues. I'll try stepping down the intensity and see if it changes things. However, Nudity was in the negatives and wasn't bracketed, and a lot of the later generations were ignoring it completely, even though the program was happily adhering to it before.

Still will run some tests and see what occurs.

Mirabilis1729 commented 1 year ago

@hentailord85ez

OK did a couple of tests

Baseline Original from yesterday first: -

00366-2162382813-Scarlett Johansson portrait, ((large breast)),((toned abs)),(thick thigh) sensual post, smirk, forward facing, fantasy, elegant,

First test. Kept all the positive () in the prompt, but reduced all the negatives down to one bracket each, and removed a couple of repeats: -

195168443-2162382813-Scarlett Johansson portrait, ((large breast)),((toned abs)),(thick thigh) sensual post, smirk, forward facing, fantasy, elegant -Single brackets,

Second test. Kept all the positive () in the prompt, but reduced all the negatives down by one bracket (save the single-bracket ones) and kept everything else: -

195168445-2162382813-Scarlett Johansson portrait, ((large breast)),((toned abs)),(thick thigh) sensual post, smirk, forward facing, fantasy, elegant 1-step back,

Certainly both show improvement on the previous output using the original prompt unaltered today: -

02557-2162382813-Scarlett Johansson portrait, ((large breast)),((toned abs)),(thick thigh) sensual post, smirk, forward facing, fantasy, elegant(1)

However, they definitely are different from the baseline from yesterday both in terms of overall appearance and intensity (most notable with the colour shift with the flowers).

Anyway, I'll try out random seeds using the reduced settings on the default model as well as the custom one and see if it behaves better.

Hope that is helpful

captainweasly commented 1 year ago

Took me a while to narrow it down; the branch version [ti-preprocess] gave me the same direct results under the same settings. Cheers.

onusai commented 1 year ago

I've seen a pretty big drop in aesthetic over the last few days. I narrowed it down to this commit e59c66c0088422b27f64b401ef42c242f836725a

Example images before and after that commit:

Prompt if anyone wants to try reproducing:

1girl armchair bangs bird black eyes black hairband blue shirt blush bobby socks book buttons cat chair closed eyes closed mouth crow frilled shirt collar frilled sleeves frills full body hairband heart heart button long sleeves medium hair nekomata pink hair pink skirt pink socks red footwear shirt sitting skirt sleeping slippers socks third eye white background wide sleeves kaenbyou rin kaenbyou rin cat komeiji satori absurdres highres housulu touhou
Negative prompt: trans dickgirl futa futanari [hat : 0.45] (extra nipple), (extra breast), out of frame, amputee, mutated, mutation, deformed, severed, dismembered, corpse, pubic, real, photograph, poorly drawn, bad anatomy, blur, blurry, lowres, bad hands, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, artist name
Steps: 30, Sampler: Euler a, CFG scale: 7, Seed: 2211058756, Size: 512x768, Model hash: d4af22a3

Model is wd1.3 merged with NAI at 50%, other 2 seeds are 828418570 and 733561261

Mirabilis1729 commented 1 year ago

Definitely notable

MartinCairnsSQL commented 1 year ago

It looks like the code was reverted to use last_hidden_state when not skipping layers in ad3ae441081155dcd4fde805279e5082ca264695 rather than the final_layer_norm method in e59c66c0088422b27f64b401ef42c242f836725a

specblades commented 1 year ago

I'm literally using default settings, vanilla SD 1.4 and the --xformers arg. I updated the repo a day ago. My results are pretty close, but different. They were made in a row, one after the other, without changing any settings. A little panic and disappointment.

UPD: --xformers causes it; just disable it.

Look at the columns in 1-2 and the hand in 3-4:

00342-3094012949-a beautiful Greek woman, wearing ornate girdled chiton and Ionic peplos, on a terrace on a sunny day, ancient greece, fullbody p
00338-3094012949-a beautiful Greek woman, wearing ornate girdled chiton and Ionic peplos, on a terrace on a sunny day, ancient greece, fullbody p
00349-3094012949-a beautiful Greek woman, wearing ornate belted chiton and Ionic peplos, on a terrace on a sunny day, ancient greece, fullbody po
00350-3094012949-a beautiful Greek woman, wearing ornate belted chiton and Ionic peplos, on a terrace on a sunny day, ancient greece, fullbody po
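On the `--xformers` observation: one commonly cited reason that optimized kernels can give slightly different results from run to run is that floating-point addition is not associative, so summing the same numbers in a different order can change the last bits. Whether this is what actually happens inside xformers is an assumption here; the sketch below only demonstrates the arithmetic effect in plain Python.

```python
from typing import Iterable


def sum_in_order(values: Iterable[float]) -> float:
    """Accumulate left to right, the way a simple loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]
forward = sum_in_order(values)             # ((0.1 + 0.2) + 0.3)
backward = sum_in_order(reversed(values))  # ((0.3 + 0.2) + 0.1)

print(forward == backward)           # False
print(abs(forward - backward) < 1e-15)  # True: the gap is tiny but nonzero
```

A difference this small is harmless on its own, but a sampler feeds each step's output into the next, which would fit results that are "pretty close, but different".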

gerstnr commented 1 year ago

Unfortunately I can confirm the issue. It also occurs without negative prompting or emphasis. Disabling xformers did not help. Mac OS M1 on latest main branch. Last Saturday everything was still working as expected.

My simple test case uses only SD 1.4, no face restoration or hires fix:

a sad cat
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 2641146490, Size: 512x512, Model hash: 7460a6fa
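When comparing two runs, it can help to rule out an accidental settings difference first. Below is a minimal sketch that parses a settings line like the one above into a dictionary and reports any keys that differ. The helper names are hypothetical, and it assumes a plain comma-separated list of `key: value` pairs with no commas inside the values.

```python
from typing import Dict, Optional, Tuple


def parse_settings(line: str) -> Dict[str, str]:
    """Split 'Key: value, Key: value' into a dict of strings."""
    settings = {}
    for part in line.split(", "):
        key, _, value = part.partition(": ")
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(
    a: Dict[str, str], b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


line = ("Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 2641146490, "
        "Size: 512x512, Model hash: 7460a6fa")
old = parse_settings(line)
new = parse_settings(line)  # in practice, the line from the second image

print(old["Seed"])              # 2641146490
print(diff_settings(old, new))  # {}
```

An empty diff means the recorded settings match, so any remaining difference in the image has to come from somewhere else, such as the code version or the attention backend.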

00159-2641146490-a sad cat 00160-2641146490-a sad cat

Happy to run more tests.

gerstnr commented 1 year ago

Tried to rule out config updates and other updates I did lately. Complete re-setup from latest commit did not change anything, well except the thing that should not – still getting various variations of sad cats.

This is with basically a vanilla setup. The only change I made in the Mac setup script is that I went for Python 3.9 instead of 3.10.

0xdevalias commented 1 year ago

A friend mentioned that using xformers could make things non-deterministic, and that there were a lot of references to it on the repo issues here. Wanting to understand a bit more about it, and link a bunch of potentially related issues together, I tried to find as many issues as I could that seemed to be related to xformers and the potential for it to be causing non-deterministic / unstable / inconsistent results:

The following may potentially be related (ordered by issue number):

Issues:

Discussions:

I also came across this thread in the xformers repo, which while I can't guarantee is related, am wondering if it might be:

And a question I raised on a PR in the diffusers repo: