huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0
25.16k stars 5.2k forks

can't use the negative_prompt module in the fluxpipeline #9124

Open DaaadShot opened 1 month ago

DaaadShot commented 1 month ago

The performance would be better if a negative_prompt module were added.

I run fluxpipeline as below:

pipe = FluxPipeline.from_pretrained(model_path, torch_dtype=torch.bfloat16)
image = pipe(
    prompt="a cat",
    negative_prompt="ugly, messy, watermark",
    width=OUTPUT_WIDTH,
    height=OUTPUT_HEIGHT,
    guidance_scale=0.0,
    max_sequence_length=256,
    output_type="pil",
    num_inference_steps=num_inference_step,  # use a larger number if you are using [dev]
    generator=torch.Generator("cuda").manual_seed(seed),
).images[0]

Got this TypeError: FluxPipeline.__call__() got an unexpected keyword argument 'negative_prompt'

Could you please support negative_prompt or negative embeddings in the FluxPipeline? Thanks.
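The TypeError above comes from passing a keyword that the pipeline's `__call__` does not declare. A minimal sketch of checking this up front with the standard-library `inspect` module follows. `StandInPipeline` and `split_supported_kwargs` are hypothetical names used only for illustration; the stand-in class mimics a pipeline without a `negative_prompt` parameter and is not the real FluxPipeline.

```python
import inspect


class StandInPipeline:
    """Hypothetical stand-in for a pipeline whose __call__ lacks negative_prompt."""

    def __call__(self, prompt, guidance_scale=0.0, num_inference_steps=4):
        # A real pipeline would run the model; this only echoes its arguments.
        return {
            "prompt": prompt,
            "guidance_scale": guidance_scale,
            "steps": num_inference_steps,
        }


def split_supported_kwargs(pipe, kwargs):
    """Split kwargs into those pipe.__call__ accepts and those it would reject."""
    params = inspect.signature(pipe.__call__).parameters
    # If __call__ takes **kwargs, every keyword is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs), {}
    supported = {k: v for k, v in kwargs.items() if k in params}
    dropped = {k: v for k, v in kwargs.items() if k not in params}
    return supported, dropped


pipe = StandInPipeline()
requested = {
    "prompt": "a cat",
    "negative_prompt": "ugly, messy, watermark",
    "guidance_scale": 0.0,
}
supported, dropped = split_supported_kwargs(pipe, requested)
result = pipe(**supported)  # no TypeError: the unsupported keyword was filtered out
print("ignored keywords:", sorted(dropped))
```

Note that filtering the keyword only avoids the error; the negative prompt still has no effect on the generated image.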

a-r-r-o-w commented 1 month ago

When you use guidance, or more specifically Classifier-Free Guidance (CFG), you use two prompts: the positive prompt describes the things you want in the image, and the negative prompt describes the things you don't want. Effectively, what you're doing in CFG is generating with both a negative prompt and a positive prompt, and trying to "increase" the distance between the two results using guidance_scale. The generation with the positive prompt is known as the conditional generation, while the one with the negative prompt is known as the unconditional generation.

Before the negative prompt was called the "negative prompt", it was simply an empty string, meaning the generation was conditioned on nothing. Later, it was found that instead of conditioning on nothing, you could condition the generation on things that you don't want in the final image, and it would lead to better results.

Flux, however, is a "guidance-distilled" model. A guidance-distilled model, in simple terms, can be thought of as something that doesn't require the "increase distance between unconditional and conditional generation" part mentioned above. So it does not make sense to support a negative prompt here.
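To make the "increase the distance" step concrete, here is a minimal pure-Python sketch of the usual CFG combination, uncond + guidance_scale * (cond - uncond). `toy_denoiser` is a hypothetical stand-in, not a real model: it just shifts the latent by an amount that depends on the prompt so the two branches differ. The point is that classic CFG needs two model evaluations per step (one per prompt), which is the part a guidance-distilled model is trained to do without.

```python
from typing import Callable, List

Vector = List[float]


def toy_denoiser(latent: Vector, prompt: str) -> Vector:
    """Hypothetical stand-in for a model's noise prediction.

    The prompt only shifts the output by its length; a real model would
    condition on text-encoder embeddings instead.
    """
    shift = 0.01 * len(prompt)
    return [x + shift for x in latent]


def cfg_noise_pred(
    denoiser: Callable[[Vector, str], Vector],
    latent: Vector,
    prompt: str,
    negative_prompt: str = "",
    guidance_scale: float = 7.5,
) -> Vector:
    """Combine conditional and unconditional predictions as in classic CFG."""
    cond = denoiser(latent, prompt)             # positive prompt branch
    uncond = denoiser(latent, negative_prompt)  # negative (or empty) prompt branch
    # Push the result away from the unconditional prediction.
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


latent = [0.0, 1.0]
guided = cfg_noise_pred(toy_denoiser, latent, "a cat", "ugly", guidance_scale=7.5)
print(guided)
```

With guidance_scale=1.0 the result reduces to the conditional prediction, and with 0.0 to the unconditional one; larger values extrapolate further away from the negative-prompt branch.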

Of course, as always, the diffusion community has discovered ways/"tricks" of using a negative prompt with Flux. A Google search would be your friend for that.

billnye2 commented 4 weeks ago

It does still make sense to use the negative prompt in flux, as demonstrated by countless posts on the stable diffusion subreddit. Diffusers should add it to stay relevant

asomoza commented 4 weeks ago

Can you please show some examples where the negative prompt does indeed improve the image?

I've seen those countless reddit posts, and each of them says it's the best solution and that it improves the generation a lot, but I don't really see it. There are also a lot of posts about Flux each day, so I probably didn't see all of them, or the best ones that you're probably referring to.

I think if there's a noticeable improvement we can test it and try to add one that really works, or we can add it as a community pipeline if it's too hacky, like some of the nodes I've seen.

billnye2 commented 4 weeks ago

Hmm, must be a difference in opinion or in certain scenarios. When I initially saw those reddit posts I tested it, and it helped to get rid of the blurry/bokeh background for the images I tested. Testing right now, though, I see that certain positive prompts with specific word choices can remove the blurred background, but in my initial testing a couple of weeks ago I saw a difference using negative prompts to get amateur photos.

asomoza commented 4 weeks ago

Oh yeah, I know, I'm also struggling to get real, normal amateur photos with Flux. IMO the reverse prompting works really well to get rid of the bokeh, but the images are still too perfect.

If there's a negative prompt with a node that works with this I'll gladly test it and see how we can add it, but I think the real solution to that will come with loras or a fine tune of the model.

billnye2 commented 4 weeks ago

Yeah, some loras have been good; the only problem is that so far they seem to deteriorate the model in terms of body anatomy, from my testing. When you mention reverse prompting, do you mean things like pez dispenser?

asomoza commented 4 weeks ago

Not sure what you mean (probably the same) but it's to describe the background as the subject and the subject as an addition to the image, better to show it:

normal prompt

a photo of a man at the park walking his dog with trees and a mountain in the background.

reverse prompt

a photo of a mountain and a park with trees where a man is walking his dog.

[Images: result for the normal prompt (20240818235039_308335313) and result for the reverse prompt (20240818234910_308335313)]

billnye2 commented 4 weeks ago

Oh damn cool strategy, that makes sense. The thing I was referring to was this

chuck-ma commented 1 week ago

Images generated by Flux sometimes contain watermarks. I hope to alleviate this problem with a negative prompt. @asomoza

asomoza commented 1 week ago

That's what you hope for, but the Flux models don't work with a real "negative prompt" since they're distilled. Some people try to add this with different "non-traditional" methods; as I wrote before, there are quite a lot of them.

If you find one that "works" all the time for the "watermarks", or that behaves like the traditional negative prompt, can you please link to the method or library and also post some images?

We can't add more complex code and invest time in something when we don't know if it works.

chuck-ma commented 1 week ago

Ok, thanks anyway.