invoke-ai / InvokeAI

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI and serves as the foundation for multiple commercial products.
https://invoke-ai.github.io/InvokeAI/
Apache License 2.0

[bug]: Flux models don't work on a Mac M2 device. Gets an error like this -> ImportError: The bnb modules are not available. Please install bitsandbytes if available on your platform. #6965

Open moheshmohan opened 2 months ago

moheshmohan commented 2 months ago

Is there an existing issue for this problem?

Operating system

macOS

GPU vendor

Apple Silicon (MPS)

GPU model

No response

GPU VRAM

No response

Version number

5.0.0

Browser

chrome 129.0.6668.60

Python dependencies

No response

What happened

When I run Flux models I get the below error:

ImportError: The bnb modules are not available. Please install bitsandbytes if available on your platform.
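
For reference, a quick way to check whether bitsandbytes is importable at all in a given environment (a generic shell check, not something InvokeAI itself runs):

python3 -c "import bitsandbytes" \
  && echo "bitsandbytes is importable" \
  || echo "bitsandbytes is missing or unsupported on this platform"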

What you expected to happen

I expected Flux models to run. I have been using Flux on the same device with other software like DiffusionBee.

How to reproduce the problem

No response

Additional context

No response

Discord username

No response

colinux commented 2 months ago

I installed bitsandbytes manually (pip3 install bitsandbytes in the virtual env). Then there is the error TypeError: BFloat16 is not supported on MPS, which led me to https://github.com/bitsandbytes-foundation/bitsandbytes/issues/1020: the port for the MPS architecture is still a WIP. From my understanding, we have to wait for this to be completed, unless another solution emerges with different quantizations or compression techniques. I tried to import the Flux fp8 checkpoints from DrawThings, but they are not compatible.
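
A minimal way to reproduce the bfloat16-on-MPS limitation outside InvokeAI (a sketch, assuming you are inside the InvokeAI virtual env on an Apple Silicon Mac):

# Raises "TypeError: BFloat16 is not supported on MPS" on torch builds that
# lack MPS bfloat16 support; prints a tensor on builds that have it.
python3 -c "import torch; print(torch.ones(1, dtype=torch.bfloat16, device='mps'))"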

Adreitz commented 2 months ago

As far as I know, you will need to update torch and torchvision to a more recent version to get bfloat16 support on MPS. You can try torch 2.3.1 or a recent nightly build. However, I don't know if this will help you get quantized Flux working, as MPS does not support fp8 at all.
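
A sketch of that upgrade inside the InvokeAI venv (the version pins are illustrative; torchvision 0.18.1 is the release paired with torch 2.3.1):

# Upgrade to a pinned release...
pip install --upgrade "torch==2.3.1" "torchvision==0.18.1"
# ...or try a nightly build (on macOS the nightly wheels live on the cpu index):
pip install --pre --upgrade torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu

Note that upgrading torch underneath InvokeAI can break its pinned dependencies, so treat this as an experiment.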

Vargol commented 1 month ago

Yep, I'm afraid you need enough memory to run the fp16 version of Flux at the moment, as bitsandbytes doesn't support macOS.

And you'll need to upgrade torch in your InvokeAI venv.

You also need to make a couple of code changes to the InvokeAI code (which in turn depends on what version of torch you upgrade to; with 2.3.1 you need to change a function call, which is in theory fixed in the PyTorch nightlies now, but I've not tested it yet).

vicento commented 1 month ago

Following this. I face the same problem: ImportError: The bnb modules are not available. Please install bitsandbytes if available on your platform.

Vargol commented 1 month ago

Are you still trying to use a quantised model? I've not got bitsandbytes installed, and the non-quantised Flux [schnell] model works without issue (apart from being really slow, as I've not quite got enough memory). None of the quantised formats used for Flux currently work on a Mac, even once the code changes have been made.

(InvokeAI) M3iMac:InvokeAI $ pip show bitsandbytes
WARNING: Package(s) not found: bitsandbytes
(InvokeAI) M3iMac:InvokeAI $ cat ~/bin/run_invoke.sh 
export INVOKEAI_ROOT=/Users/xxxx/invokeai
export PYTORCH_ENABLE_MPS_FALLBACK=1
cd /Volumes/SSD2TB/AI/InvokeAI 
. bin/activate
invokeai-web

(InvokeAI) M3iMac:InvokeAI $ ~/bin/run_invoke.sh 
...
100%|████████████████████████████████| 4/4 [06:18<00:00, 94.59s/it]
[2024-09-30 18:20:58,841]::[InvokeAI]::INFO --> Graph stats: a751e29a-1cd7-4d97-afc3-211f2cecb821
                          Node   Calls   Seconds  VRAM Used
             flux_model_loader       1    0.008s     0.000G
             flux_text_encoder       1  199.630s     0.000G
                  flux_denoise       1  421.639s     0.000G
                 core_metadata       1    0.014s     0.000G
               flux_vae_decode       1    7.373s     0.000G
TOTAL GRAPH EXECUTION TIME: 628.665s
TOTAL GRAPH WALL TIME: 628.688s
RAM used by InvokeAI process: 0.81G (+0.271G)
RAM used to load models: 40.50G
RAM cache statistics:
   Model cache hits: 6
   Model cache misses: 6
   Models cached: 1
   Models cleared from cache: 1
   Cache high water mark: 22.15/11.00G

[2024-09-30 18:20:58,888]::[uvicorn.acce
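
For anyone adapting the launcher script above, here is a commented version (the paths are specific to my machine; adjust them to your install):

export INVOKEAI_ROOT=/Users/xxxx/invokeai   # where InvokeAI keeps its models and config
export PYTORCH_ENABLE_MPS_FALLBACK=1        # let PyTorch fall back to CPU for ops MPS lacks
cd /Volumes/SSD2TB/AI/InvokeAI              # the install directory containing the venv
. bin/activate                              # activate the virtual env
invokeai-web                                # launch the web UI
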
cchance27 commented 1 month ago

GGUF works on PyTorch 2.4.1 (nightly PyTorch breaks GGUF)... at least in Comfy, and it works with all other nodes.

Also, there are now MLX nodes for Comfy that load 4-bit versions of Flux... but those have other issues (no compatibility with most things yet).
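
If you want to experiment with GGUF-quantised Flux, pinning the versions reported working above is a reasonable starting point (torchvision 0.19.1 is the release paired with torch 2.4.1; untested with InvokeAI):

pip install "torch==2.4.1" "torchvision==0.19.1"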