huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Is it possible diffusers implement an official support on the increasing or decreasing weight of prompt with () & []? #2431

Closed garyhxfang closed 1 year ago

garyhxfang commented 1 year ago

Is your feature request related to a problem? Please describe.

Currently, AUTOMATIC1111/stable-diffusion-web-ui supports increasing or decreasing the weight of parts of a prompt with () & [], which diffusers does not support. (e.g. "best_quality (1girl:1.3) bow bride brown_hair closed_mouth frilled_bow frilled_hair_tubes frills (full_body:1.3) fox_ear hair_bow hair_tubes ((happy)) hood japanese_clothes kimono (long_sleeves) red_bow smile solo tabi uchikake white_kimono wide_sleeves cherry_blossoms")
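To make the syntax concrete, here is a minimal sketch of how such a prompt could be split into (text, weight) pairs. It assumes the web UI's convention that each pair of () multiplies the weight by 1.1, each pair of [] divides it by 1.1, and (text:1.3) sets an explicit multiplier. The function name `parse_prompt_weights` is hypothetical and not part of diffusers, and escape sequences such as \( are not handled.

```python
import re

# Assumed convention (as in AUTOMATIC1111's web UI):
# (...) multiplies the weight by 1.1, [...] divides it by 1.1,
# and (text:1.3) applies an explicit multiplier instead.
ROUND_MULT = 1.1
SQUARE_MULT = 1 / 1.1

# Tokens: "(", "[", ":<number>)", ")", "]", plain text, or a stray ":".
TOKEN_RE = re.compile(r"\(|\[|:\s*([+-]?\d*\.?\d+)\s*\)|\)|\]|[^()\[\]:]+|:")


def parse_prompt_weights(prompt):
    """Hypothetical helper: return a list of (text, weight) tuples."""
    result = []         # mutable [text, weight] entries
    round_starts = []   # index into `result` where each open "(" began
    square_starts = []  # index into `result` where each open "[" began

    def multiply_from(start, factor):
        for item in result[start:]:
            item[1] *= factor

    for match in TOKEN_RE.finditer(prompt):
        token = match.group(0)
        explicit = match.group(1)
        if token == "(":
            round_starts.append(len(result))
        elif token == "[":
            square_starts.append(len(result))
        elif explicit is not None and round_starts:
            multiply_from(round_starts.pop(), float(explicit))
        elif token == ")" and round_starts:
            multiply_from(round_starts.pop(), ROUND_MULT)
        elif token == "]" and square_starts:
            multiply_from(square_starts.pop(), SQUARE_MULT)
        else:
            # Plain text (or an unmatched bracket/colon kept literally).
            result.append([token, 1.0])

    # Treat brackets that were never closed as if closed at the end.
    for start in round_starts:
        multiply_from(start, ROUND_MULT)
    for start in square_starts:
        multiply_from(start, SQUARE_MULT)

    # Merge neighbouring chunks that ended up with the same weight.
    merged = []
    for text, weight in result:
        if merged and merged[-1][1] == weight:
            merged[-1][0] += text
        else:
            merged.append([text, weight])
    return [(text, weight) for text, weight in merged]


print(parse_prompt_weights("(worst quality:2), (low quality:2)"))
# [('worst quality', 2.0), (', ', 1.0), ('low quality', 2.0)]
```

A pipeline that supported this syntax would still need to apply these weights to the text embeddings. The sketch covers only the parsing step.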

I am requesting this feature because I found that, for many models on civitai, negative prompts with specific weights are very important for generating a good result, for example (worst quality:2), (low quality:2). I tried for a long time and found it almost impossible to generate results of similar quality without increasing or decreasing the weights in the negative prompt. (I tried duplicating "worst quality" a different number of times (2, 3, or 4 times) in my negative prompt, but every variant generated results of much worse quality than (worst quality:2).)

Describe alternatives you've considered

When investigating a solution, I found a community pipeline, Long Prompt Weighting Stable Diffusion, which supports this feature. But after trying it, I found it quite unstable: it often gets stuck for a long time during inference, which means it cannot be used in a production environment. So I think a better alternative is to support this directly in the official StableDiffusionPipeline.

Describe the solution you'd like

An example of how I would like this to work is shown below:

import torch
from diffusers import StableDiffusionPipeline

# weight_config is the proposed new option; it does not exist in diffusers today
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", weight_config=True)
pipe = pipe.to("cuda")

prompt = "best_quality (1girl:1.3) bow bride brown_hair closed_mouth frilled_bow frilled_hair_tubes frills (full_body:1.3) fox_ear hair_bow hair_tubes ((happy)) hood japanese_clothes kimono (long_sleeves) red_bow smile solo tabi uchikake white_kimono wide_sleeves cherry_blossoms"
negative_prompt = "(worst quality:2), (low quality:2)"
image = pipe(prompt=prompt, negative_prompt=negative_prompt).images[0]

I do hope that @patrickvonplaten can take a look at this request. It would be very helpful for us developers to generate images of the same or even better quality than the ones users generate with AUTOMATIC1111/stable-diffusion-web-ui.