Closed JustMaier closed 1 year ago
I'm having the same issue. Using the AND conditional formatter seems to greatly exacerbate it.
Prompt: A stuffed bear toy wearing a tophat AND a table ornament
CFG scale: 7, Seed: 123456789, Size: 512x512
Without xformers:
With xformers:
The differences in this example are subtle, but whilst doing other testing, there were some cases where the result was actually a completely different image.
Looks like sometimes it makes the image better, sometimes worse.
Perhaps this is just the nature of the optimization?
Maybe? The same thing happened when split attention became enabled by default:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/cae5c5fa8d88a6d4206ec7d89e53685d53afe4c0
k-diffusion updates seem to alter outputs as well:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/b6f80bdcc2191a43428f9491441a6f8e70b84be8
https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/c30c06db207a580d76544fd10fc1e03cd58ce85e
I've run into this issue a lot while optimizing code on my own branches. It seems to happen mainly due to rounding differences; you can usually see this by using fp32 instead of fp16, which reduces the differences quite a lot. Of course, the step from fp16 to fp32 itself also introduces a lot of small differences.
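To illustrate the rounding point above, here is a toy sketch (not xformers code, and not how any particular attention kernel is implemented). It assumes that an optimized code path may add the same numbers in a different order, and uses Python's `struct` half-precision format to simulate an accumulator that keeps only fp16 or fp32 precision. The helper names `round_to` and `rounded_sum` are made up for this example.

```python
import struct


def round_to(x, fmt):
    """Round a Python float to a struct format's precision ('e' = fp16, 'f' = fp32)."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def rounded_sum(values, fmt):
    """Add left to right, rounding the running total after every addition."""
    total = 0.0
    for v in values:
        total = round_to(total + v, fmt)
    return total


# Near 2048 the spacing between adjacent fp16 values is 2,
# so adding 0.5 to 2048 rounds straight back to 2048.
values = [2048.0, 0.5, 0.5, 0.5, 0.5]

fp16_forward = rounded_sum(values, "e")            # large value first
fp16_reverse = rounded_sum(reversed(values), "e")  # small values first
fp32_forward = rounded_sum(values, "f")
fp32_reverse = rounded_sum(reversed(values), "f")

print("fp16:", fp16_forward, fp16_reverse)
print("fp32:", fp32_forward, fp32_reverse)
```

In fp16 the two orderings disagree, while in fp32 they agree for these particular values, which is consistent with the observation that switching to fp32 shrinks the differences. In a diffusion model, a small discrepancy like this is fed back through many sampling steps, so it can plausibly grow into a visibly different image.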
OK, so this is normal, I guess. This issue became very noticeable to me today when I was using a long, complicated prompt with AND while also using a higher resolution with the hi res fix. I can generate the identical prompt 5 times in a row and the output images are all different variations of the same thing, sometimes drastically so. Changes at smaller resolutions with simple prompts seem to be much more subtle.
I notice that with Euler and DDIM (512x512) it is able to reproduce the same image over and over. That image differs from the one created with the same settings without --xformers, but that part is to be expected. At bigger sizes, all the methods produce different variations from the same starting settings.
I think it would be normal to expect --xformers to change the outcome a bit, but that the changed outcome would be repeatable, not variant on each generation from the same starting point.
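The repeatability expectation described above can be checked mechanically. The sketch below is only a harness outline: `toy_generate` and `toy_generate_unstable` are hypothetical stand-ins for a real image pipeline (they return raw bytes, not images), and `is_repeatable` is an invented helper name. The idea is to run the same seed several times and compare hashes of the outputs.

```python
import hashlib
import random


def output_digest(data: bytes) -> str:
    """Hash the raw output so runs can be compared cheaply."""
    return hashlib.sha256(data).hexdigest()


def is_repeatable(generate, seed, runs=5):
    """Call generate(seed) several times; True only if every output is byte-identical."""
    digests = {output_digest(generate(seed)) for _ in range(runs)}
    return len(digests) == 1


def toy_generate(seed):
    """Hypothetical stand-in for a pipeline that is deterministic for a given seed."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(64))


_calls = 0


def toy_generate_unstable(seed):
    """Same stand-in, but the last byte depends on the call count, mimicking run-to-run drift."""
    global _calls
    _calls += 1
    data = bytearray(toy_generate(seed))
    data[-1] = (data[-1] + _calls) % 256
    return bytes(data)


stable_result = is_repeatable(toy_generate, 123456789)
unstable_result = is_repeatable(toy_generate_unstable, 123456789)
print(stable_result, unstable_result)
```

With a real pipeline, `generate` would wrap one full generation with fixed prompt, seed, sampler, and size, and return the image bytes. A byte-exact comparison is strict by design: it separates "different from the non-xformers result but stable" from "different on every run".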
It also screws up [a|b], a AND b and TI embeddings big time.
A friend mentioned that using xformers could make things non-deterministic, and that there were a lot of references to it on the repo issues here. Wanting to understand a bit more about it, and link a bunch of potentially related issues together, I tried to find as many issues as I could that seemed to be related to xformers and the potential for it to be causing non-deterministic / unstable / inconsistent results:
The following may potentially be related (ordered by issue number):
Issues:
Discussions:
I also came across this thread in the xformers repo, which while I can't guarantee is related, am wondering if it might be:
And a question I raised on a PR in the diffusers repo:
A friend just mentioned that apparently using xformers tends to lower the quality of generated images, and apparently even causes StableDiffusion to generate different images for the same seed/settings. I haven't looked too deeply into things to try and validate that, but I was wondering if a) that is something that is currently known/documented, and b) if that's likely to be fixed at any point, or if it's just 'the price we pay' for the benefits it adds. Would any of that be a concern with integrating things here? I did stumble upon this issue on the xformers repo, but not sure if it would be the same/similar root cause as my friend described: https://github.com/facebookresearch/xformers/issues/219
Describe the bug
With the new xformers optimization, the same generation parameters no longer produce the same image; see the examples below. It seems to be less noticeable given more steps.
To Reproduce
Steps to reproduce the behavior:
Enable the xformers enhancement by adding --xformers to the launch params.
Expected behavior
Generations should be consistent given the same parameters.
Screenshots
With xformers:
Without xformers:
With xformers + differences highlighted:
Without xformers + consistency highlighted:
Additional context
Perhaps this is just the nature of the optimization?