Closed bkosowski closed 9 months ago
Prompt weighting on the Diffusers backend is actually functional, unlike on the Original backend, and you are using extreme prompt weights.
Try re-generating in both backends with no negative prompt - these 1000-token-long "super universal negative prompts" likely do more harm than good anyway.
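To illustrate why extreme weights behave differently when weighting actually works: below is a minimal, hypothetical sketch of how A1111-style weight syntax is commonly parsed (this is not SD.Next's actual parser; real parsers also handle nesting, escapes, and BREAK). `(word:1.4)` sets an explicit multiplier, `(word)` means ×1.1, `[word]` means ÷1.1 - so stacked or large weights scale the text embeddings far from their trained range.

```python
import re

# Simplified illustration of A1111-style prompt weighting syntax.
# NOT the actual SD.Next parser - nesting and escapes are omitted.
TOKEN_RE = re.compile(
    r"\(([^():]+):([\d.]+)\)"   # (text:weight) -> explicit weight
    r"|\(([^()]+)\)"            # (text)        -> weight * 1.1
    r"|\[([^\[\]]+)\]"          # [text]        -> weight / 1.1
    r"|([^()\[\]]+)"            # plain text    -> weight 1.0
)

def parse_weights(prompt):
    parts = []
    for m in TOKEN_RE.finditer(prompt):
        explicit, explicit_w, paren, bracket, plain = m.groups()
        if explicit is not None:
            parts.append((explicit, float(explicit_w)))
        elif paren is not None:
            parts.append((paren, 1.1))
        elif bracket is not None:
            parts.append((bracket, 1.0 / 1.1))
        elif plain and plain.strip():
            parts.append((plain.strip(), 1.0))
    return parts

print(parse_weights("masterpiece, (detailed background:1.4), [blurry]"))
```

On a backend where weighting is a no-op, a prompt full of `(…:1.8)` and `[[…]]` degrades gracefully; on a backend where it works, those multipliers are actually applied to the embeddings, which is why the same prompt can produce a "broken" image.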
Changing the prompts to use no custom weights:
- Positive prompt:
A woman standing on a side walk, 40 years old, wearing a business suit, wearing glasses,
complex background, messy, detailed background, raining, cloudy day,
masterpiece, photorealistic, realistic, hyperrealistic, hyperdetailed, ultrarealistic, ultra highres, 4k, HDR, high detail, high quality, studio photo, professional photography, extra details, very detailed, intricate fine detail
- Negative prompt:
BeyondNegative_v4-neg, CyberRealistic_Negative,
portrait, closeup,
censored, deformed, grotesque, amputated, disfigured, mutilated, bad anatomy, poorly drawn face, mutated, extra limb, ugly, poorly drawn hands, missing limb, floating limbs, disconnected limbs, disconnected head, malformed hands, long neck, mutated hands and fingers, bad hands, missing fingers, cropped, worst quality, low quality, mutation, poorly drawn, huge calf, bad hands, fused hand, fused legs, missing hand, disappearing arms, disappearing thigh, disappearing calf, disappearing legs, missing fingers, fused fingers, abnormal eye proportion, abnormal hands, abnormal legs, abnormal feet, long feet, big feet, elongated feet, abnormal fingers, duplicated, mirrored, ugly, obese, fat,
monochrome, grayscale, b&w, black and white, oversaturated, sepia,
worst quality, low quality, normal quality, lowres, low resolution, jpeg artifacts, cropped, out of frame, canvas frame, border, frame, picture frame, haze, blur, blurry, unfocused, depth of field, text, error, username, signature, watermark, logo,
rendered, 3D render, Octane render, Cinema 4D, Blender, Unreal, Unreal Engine, 3ds Max, Maya, Milkshape 3D, Unity, CG, CGI, computer graphics, computer generated image, computer animation, video game, trending on CGSociety, drawing, cartoon, painting, illustration, anime, sketch,
This also produces a broken image (when compared to the original backend):
Try with just this on both backends:
Positive:
A woman standing on a side walk, 40 years old, wearing a business suit, wearing glasses,
complex background, messy, detailed background, raining, cloudy day,
Negative:
cartoon, painting, illustration, worst quality, low quality, normal quality
And I am assuming you are using DPM++ 2M with the Karras checkbox turned on.
Using only the positive prompt leads to an image that is of much worse quality than the image generated through the original backend with the negative prompt:
Can't the diffusers backend really handle negative prompts?
> Can't the diffusers backend really handle negative prompts?
No, it is doing exactly what you told it to do.
Adding the simple negative prompt:
cartoon, painting, illustration, worst quality, low quality, normal quality
leads to an image that is also of much worse quality than the one generated through the original backend with a long negative prompt (though, of course, much better than the image generated through the diffusers backend with the long negative prompt):
Yes, it's DPM++ 2M karras.
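For context on what the Karras checkbox changes: it replaces the sampler's default sigma spacing with the rho-interpolated schedule from Karras et al. (rho = 7 in the paper). A minimal sketch, with illustrative sigma bounds rather than SD.Next's actual configuration:

```python
# Karras et al. noise schedule used by "DPM++ 2M Karras".
# sigma_min/sigma_max here are illustrative, not the model's real values.
def karras_sigmas(n, sigma_min=0.1, sigma_max=10.0, rho=7.0):
    ramp = [i / (n - 1) for i in range(n)]
    max_inv = sigma_max ** (1 / rho)
    min_inv = sigma_min ** (1 / rho)
    # Interpolate in sigma^(1/rho) space, then raise back to the rho power,
    # which concentrates steps at low noise levels.
    return [(max_inv + t * (min_inv - max_inv)) ** rho for t in ramp]

print([round(s, 3) for s in karras_sigmas(10)])
```

Because both backends should compute the same schedule, the sampler itself is an unlikely source of the quality gap; it mainly matters that the checkbox state matches between the two runs.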
I understand that one can generate an image using different prompts. But the issue here is that the backend handles the prompts in a completely different way (worse, in my opinion). Is there some doc that explains the differences?
The difference is, it actually works as intended. So put the things you don't want in the image into the negative prompt, and don't throw in a soup of random words that doesn't make any sense.
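To see why a working negative prompt has real effect: in classifier-free guidance the negative prompt's embedding replaces the unconditional one, so the final noise prediction is pushed away from whatever the negative prompt encodes. A toy NumPy illustration (not actual diffusers code):

```python
import numpy as np

def cfg_step(noise_uncond, noise_cond, guidance_scale=7.0):
    # Classifier-free guidance: uncond + scale * (cond - uncond).
    # With a negative prompt, noise_uncond is conditioned on that prompt,
    # steering the result away from its content.
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)

uncond = np.array([0.1, 0.2])  # prediction given the negative prompt
cond = np.array([0.3, 0.1])    # prediction given the positive prompt
print(cfg_step(uncond, cond))  # -> [ 1.5 -0.5]
```

With the guidance scale amplifying the difference term, every token in the negative prompt actively shifts the result, which is why a huge "universal" negative prompt can degrade the image on a backend that honors it.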
> Yes, it's DPM++ 2M karras.
Does it look like this? It should look like this: If not, refresh your page after changing backends.
thanks @Disty0 - i agree, this may not be what @bkosowski is used to, but the goal here is not to be 100% like A1111. prompt parser -> tokenizer -> text encoder are doing their job as intended, and it's actually much closer to how Comfy or Invoke handle prompts.
Issue Description
Images generated while on the diffusers backend are of considerably worse quality, even when trying to match settings as closely as possible. This is on SD 1.5-based models.
The two images speak for themselves. Here is an image generated through the original backend: And here is an image generated through the diffusers backend:
The quality loss is extreme!
Generation information:
The exact same thing happens after using
--reinstall
in the command line, waiting for everything to reinstall, and restarting the server. Am I doing something wrong? Is there some secret sauce that I'm missing?
Version Platform Description
Beginning of the log:
GPU:
Torch:
Libs:
Device info:
Cross attention:
System-info tab:
Launch command line args for the original backend:
--upgrade --use-cuda --models-dir "D:\AIArt\Models" --backend original --config config.json --ui-config ui-config.json --debug
Launch command line args for the diffusers backend:
--upgrade --use-cuda --models-dir "D:\AIArt\Models" --backend diffusers --config config.json --ui-config ui-config.json --debug
Relevant log output
sdnext.log
Backend
Diffusers
Branch
Master
Model
SD 1.5
Acknowledgements