Flux.1 and OSX with M1 - Githubissues

xhinker / sd_embed

Generate long weighted prompt embeddings for Stable Diffusion

Apache License 2.0

Flux.1 and OSX with M1 #14

Open odoremieux opened 1 month ago

odoremieux commented 1 month ago

I'm not able to get Flux.1 working on macOS with an M1. I'm getting:

TypeError: Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype.

Has anybody been successful?

Teriks commented 3 weeks ago

Try qint8 quantization instead of qfloat8.
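A minimal sketch of this suggestion, assuming the model is quantized with optimum-quanto (`quantize`, `freeze`, `qint8`, and `qfloat8` are that library's names; the helpers `pick_weight_qtype_name` and `quantize_for_device` are hypothetical and only for illustration). The float8 dtype is what MPS rejects in the error above, so the helper switches to int8 weights on MPS:

```python
def pick_weight_qtype_name(device: str) -> str:
    """Return the name of the quanto weight type to use on `device`.

    MPS cannot hold Float8_e4m3fn tensors, so int8 weights are used there.
    Defaulting to qfloat8 elsewhere is an assumption of this sketch.
    """
    return "qint8" if device.startswith("mps") else "qfloat8"


def quantize_for_device(module, device: str):
    """Quantize and freeze `module` in place with a device-appropriate qtype."""
    # Imported lazily so the helper above works without optimum-quanto installed.
    from optimum import quanto

    qtype = getattr(quanto, pick_weight_qtype_name(device))
    quanto.quantize(module, weights=qtype)  # mark the weights for quantization
    quanto.freeze(module)                   # replace float weights with quantized ones
    return module
```

It could be called as `quantize_for_device(pipe.transformer, "mps")` before the pipeline is moved to the device, where `pipe` is an already loaded Flux pipeline.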

odoremieux commented 2 weeks ago

I tried it. That error is gone, but now I get: Token indices sequence length is longer than the specified maximum sequence length for this model (80 > 77). Running this sequence through the model will result in indexing errors. The generated image is also completely pixelated.

Teriks commented 2 weeks ago

The message about token length can be ignored:

https://github.com/xhinker/sd_embed/issues/11#issuecomment-2282300467

As for the image result, I am not sure. I really wish I had a Mac myself to test MPS for my own projects.

You might find some users discussing getting Flux working on MPS in this issue: https://github.com/huggingface/diffusers/issues/9047

Some people in that thread are experiencing pixelated images, which might be related.
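On the token-length warning: it is typically emitted when the whole prompt is tokenized without truncation, before the ids reach the text encoder. A common way for long-prompt tools to stay within CLIP's 77-token window is to split the ids into 75-token windows and wrap each one in start and end tokens. The sketch below illustrates that idea only; whether sd_embed does exactly this for Flux is an assumption, and `chunk_token_ids` is a hypothetical name.

```python
CLIP_BOS, CLIP_EOS = 49406, 49407  # CLIP's start-of-text / end-of-text token ids


def chunk_token_ids(ids, window=75, bos=CLIP_BOS, eos=CLIP_EOS):
    """Split `ids` (without BOS/EOS) into 77-token chunks: BOS + 75 ids + EOS."""
    chunks = []
    for start in range(0, len(ids), window):
        body = list(ids[start:start + window])
        body += [eos] * (window - len(body))  # pad a short final window
        chunks.append([bos] + body + [eos])
    return chunks


# 78 content ids, i.e. 80 once BOS and EOS are counted, as in the warning above.
chunks = chunk_token_ids(list(range(1000, 1078)))
print(len(chunks), [len(c) for c in chunks])  # prints: 2 [77, 77]
```

Each chunk fits the encoder's limit, so an 80-token prompt does not by itself cause indexing errors when it is handled this way.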

odoremieux commented 2 weeks ago

Here is the prompt I used: A dreamy, soft-focus photograph capturing a romantic Jane Austen movie scene, in the style of Agnes Cecile. Delicate watercolors, misty background, Regency-era couple, tender embrace, period clothing, flowing dress, dappled sunlight, ethereal glow, gentle expressions, intricate lace, muted pastels, serene countryside, timeless romance, poetic atmosphere, wistful mood, look at camera.

It's using "black-forest-labs/FLUX.1-schnell"

The generated image: flux_image_a

I usually don't have any issues with that model. I'm just trying to get around the token limit for the prompt.

Teriks commented 2 weeks ago

If it works normally with the default diffusers embed generation (i.e., passing a prompt and/or prompt_2 string to pipe.__call__ instead of directly passing embeds generated by sd_embed), then it is possibly a precision oddity in the embed generation.
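One way to check this is to run the same prompt through both paths with identical settings and compare the results. The sketch below assumes `pipe` behaves like a diffusers Flux pipeline (accepting `prompt`, `prompt_embeds`, and `pooled_prompt_embeds`) and that `embed_fn` behaves like sd_embed's `get_weighted_text_embeddings_flux1`, returning a pair of prompt and pooled embeddings. The helper `run_both_paths` and the `make_kwargs` factory are hypothetical names.

```python
def run_both_paths(pipe, prompt, embed_fn, make_kwargs):
    """Generate one image per embedding path so the two can be compared.

    `make_kwargs` is called once per run so that each run gets fresh
    settings (for example, a newly seeded generator).
    """
    # Path A: the pipeline builds the embeddings from the prompt string itself.
    baseline = pipe(prompt=prompt, **make_kwargs()).images[0]

    # Path B: embeddings are computed externally and passed in directly.
    prompt_embeds, pooled_prompt_embeds = embed_fn(pipe=pipe, prompt=prompt)
    candidate = pipe(
        prompt_embeds=prompt_embeds,
        pooled_prompt_embeds=pooled_prompt_embeds,
        **make_kwargs(),
    ).images[0]
    return baseline, candidate


# Usage sketch (not executed here; needs torch, diffusers, and sd_embed):
# make_kwargs = lambda: dict(
#     num_inference_steps=4,
#     generator=torch.Generator().manual_seed(0),
# )
# a, b = run_both_paths(pipe, prompt, get_weighted_text_embeddings_flux1, make_kwargs)
```

If only the second image is pixelated, the problem is more likely in the embedding step than in the quantized transformer or the MPS backend.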