xhinker / sd_embed

Generate long weighted prompt embeddings for Stable Diffusion
Apache License 2.0

Add Flux Support #9

Closed MohamedAliRashad closed 3 months ago

MohamedAliRashad commented 3 months ago

It would be nice if this library supported Flux.1

xhinker commented 3 months ago

working on it

xhinker commented 3 months ago

done

RedAndr commented 3 months ago

Unfortunately, it generates different images than the text prompt does.

xhinker commented 3 months ago

> Unfortunately, it generates different images than the text prompt does.

What do you mean by generates different images?

RedAndr commented 3 months ago

@xhinker I meant that FLUX.1 generates different images from a text prompt than from the embeddings that sd_embed creates from the same prompt. Here's an example:

[Image: Fantastic 601 2 FLUX 1-schnell 0g 12s 1024x1024]
[Image: Fantastic 601 3 FLUX 1-schnell 0g 12s 1024x1024]

Prompt: Hyper-realistic photography of stray cat in a cyberpunk city. Scenery, intricate details, masterpiece, best quality.
Model: FLUX.1-schnell
Seed: 601
Guidance scale: 0
Steps: 12
Resolution: 1024x1024

The code is the same as in lpw_flux1.py.
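
For anyone who wants to reproduce the comparison, here is a rough sketch of the two runs with the settings listed above. It is not the contents of lpw_flux1.py. The Hugging Face repo id, the `sd_embed.embedding_funcs` import path, and the `get_weighted_text_embeddings_flux1(pipe=..., prompt=...)` call are assumptions on my part, and `SETTINGS` and `generate_pair` are names made up for this example.

```python
# Settings taken from the comment above; the repo id is an assumption.
SETTINGS = {
    "model": "black-forest-labs/FLUX.1-schnell",
    "prompt": (
        "Hyper-realistic photography of stray cat in a cyberpunk city. "
        "Scenery, intricate details, masterpiece, best quality."
    ),
    "seed": 601,
    "guidance_scale": 0.0,
    "steps": 12,
    "width": 1024,
    "height": 1024,
}


def generate_pair(settings):
    """Return (image_from_text, image_from_embeddings) using the same seed."""
    # Heavy imports live inside the function so the settings can be
    # inspected on a machine without a GPU or the model weights.
    import torch
    from diffusers import FluxPipeline
    # Assumed import path and signature for sd_embed's Flux helper.
    from sd_embed.embedding_funcs import get_weighted_text_embeddings_flux1

    pipe = FluxPipeline.from_pretrained(
        settings["model"], torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")

    common = dict(
        guidance_scale=settings["guidance_scale"],
        num_inference_steps=settings["steps"],
        width=settings["width"],
        height=settings["height"],
    )

    def fresh_generator():
        # A new generator per run, so both runs start from the same noise.
        return torch.Generator(device="cpu").manual_seed(settings["seed"])

    # Run 1: let the pipeline encode the text prompt itself.
    image_text = pipe(
        prompt=settings["prompt"], generator=fresh_generator(), **common
    ).images[0]

    # Run 2: encode the same prompt with sd_embed, then pass the embeddings.
    prompt_embeds, pooled_prompt_embeds = get_weighted_text_embeddings_flux1(
        pipe=pipe, prompt=settings["prompt"]
    )
    image_embed = pipe(
        prompt_embeds=prompt_embeds,
        pooled_prompt_embeds=pooled_prompt_embeds,
        generator=fresh_generator(),
        **common,
    ).images[0]

    return image_text, image_embed
```

Calling `generate_pair(SETTINGS)` needs a CUDA GPU and the model weights. Since the seed and sampler settings are identical in both runs, any difference between the two images should come from the prompt embeddings.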

xhinker commented 3 months ago

Which one is generated by sd_embed ?

RedAndr commented 3 months ago

The second one. Frankly, it doesn't look worse, but just different. I'd expect it to be the same.

xhinker commented 3 months ago

I tried many prompts and found that sd_embed generates slightly better images than Diffusers' default prompt embedding. For example, the first cat is not actually a stray cat: it is wearing a collar.

As for why different images are generated, I am not quite sure. I built the Flux embedding based on my understanding of the model architecture rather than copying the Diffusers logic, so there could be some differences in the implementation.
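
One way to narrow this down would be to compare the two sets of embeddings directly, before any image is generated. The sketch below is only a diagnostic idea and is not part of sd_embed or Diffusers. `flatten` and `compare_embeddings` are hypothetical helper names, and the inputs are assumed to be plain nested lists of numbers.

```python
import math


def flatten(values):
    """Flatten arbitrarily nested lists of numbers into one flat list."""
    if isinstance(values, (int, float)):
        return [float(values)]
    flat = []
    for item in values:
        flat.extend(flatten(item))
    return flat


def compare_embeddings(a, b):
    """Return the max absolute difference and cosine similarity of a and b."""
    x, y = flatten(a), flatten(b)
    if not x or len(x) != len(y):
        # Different sizes already show that the two code paths disagree,
        # for example on the padded sequence length.
        raise ValueError(f"size mismatch or empty input: {len(x)} vs {len(y)}")

    max_abs_diff = max(abs(p - q) for p, q in zip(x, y))
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    cosine = dot / norm if norm else float("nan")
    return {"max_abs_diff": max_abs_diff, "cosine": cosine}
```

With PyTorch tensors, something like `tensor.float().flatten().tolist()` would turn each embedding into a list first. A `max_abs_diff` near zero would mean the two paths agree and the difference comes from somewhere else. A larger value, or a size mismatch, would point at the embedding step.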

RedAndr commented 3 months ago

Ok, thanks for the explanation!