comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0

I made an fp8 implementation of flux which gets ~3.5 it/s 1024x1024 on 4090 (ADA / Hopper & 16GB vram+ only) #4528

Open Charuru opened 2 months ago

Charuru commented 2 months ago

Feature Idea

Saw the claim on this Reddit thread. Hopefully the ideas there can also be brought into ComfyUI for even more speedups.

https://www.reddit.com/r/StableDiffusion/comments/1ex64jj/i_made_an_fp8_implementation_of_flux_which_gets/

Existing Solutions

No response

Other

No response

DarkAlchy commented 2 months ago

It would appear this is Linux only? --fast throws an error when I generate, saying it is not compiled for my platform (CUDA on Windows).

mcmonkey4eva commented 2 months ago

--fast specifically requires an updated torch 2.4; other torch versions won't work. It works on Windows and Linux.

comfyanonymous commented 2 months ago

yeah if you want --fast you need pytorch 2.4 or later on windows. I recommend pytorch nightly, you can grab a standalone with pytorch nightly package here if you need one: https://github.com/comfyanonymous/ComfyUI/releases/tag/latest
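For anyone unsure whether their install clears that bar, here is a minimal sketch of a version check. It only compares the numeric part of a version string against 2.4; the helper names `parse_release` and `meets_minimum` are made up for illustration and are not part of ComfyUI or PyTorch.

```python
import re


def parse_release(version: str) -> tuple:
    """Return the leading numeric release of a version string as ints.

    "2.4.0+cu124" -> (2, 4, 0); suffixes such as ".dev..." or "+cu124" are ignored.
    """
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version: str, minimum=(2, 4)) -> bool:
    """True if the numeric release is at least `minimum` (default 2.4).

    Assumption: a dev/nightly build of 2.4 counts as 2.4 here, because the
    pre-release suffix is dropped before comparing.
    """
    return parse_release(version)[: len(minimum)] >= tuple(minimum)


# Typical use, if torch is installed:
#   import torch
#   print(meets_minimum(torch.__version__))
```

Comparing integer tuples rather than raw strings avoids the usual trap where "2.10" sorts before "2.4".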

DarkAlchy commented 2 months ago

yeah, I just found that out. Kohya is 12.4 cuda and 2.4 torch but comfyui is 12.1 and 2.3.0
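A quick way to see which torch and CUDA versions a given environment actually has is to ask torch itself. This is a small sketch; `describe_torch_env` is a hypothetical helper name, and it returns empty values if torch is not importable in the active environment.

```python
import importlib


def describe_torch_env() -> dict:
    """Report the installed torch version and the CUDA version it was built with."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        # torch is missing or failed to load in this environment
        return {"torch": None, "cuda": None}
    # torch.version.cuda is None for CPU-only builds
    return {"torch": torch.__version__, "cuda": torch.version.cuda}


if __name__ == "__main__":
    print(describe_torch_env())
```

Run it with the same Python interpreter that launches ComfyUI (venv or portable), since each one can carry a different torch build.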

DarkAlchy commented 2 months ago

> yeah if you want --fast you need pytorch 2.4 or later on windows. I recommend pytorch nightly, you can grab a standalone with pytorch nightly package here if you need one: https://github.com/comfyanonymous/ComfyUI/releases/tag/latest

I use a venv install on Windows, not the portable build, and those are all portable versions.

comfyanonymous commented 2 months ago

ComfyUI is still 2.3.1 because 2.4 seems to have memory issues for some people on windows.

AugmentedRealityCat commented 2 months ago

Is there a precompiled version of xformers that is compatible with torch 2.4?

AugmentedRealityCat commented 2 months ago

Now there is a python wheel available to install a dev version of xformers that is compatible with cuda 12.4 and torch 2.4:

https://github.com/facebookresearch/xformers/actions/runs/10559887009

All the wheel file links are in the lower half of that page, including the one I am using, xformers-0.0.28.dev893+cu124-cp311-cp311-win_amd64.whl, plus wheels for various versions of Python (3.8 to 3.12) and CUDA (118 to 124). There are ROCm and Ubuntu options as well, but I haven't tried those.
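To pick the right file from that list, it helps to read the tags in the wheel name. The sketch below splits a filename following the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag). Treating the part after `+` as the CUDA build label is a naming convention I am assuming here, not something the wheel format guarantees, and the function names are hypothetical.

```python
import sys


def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its dash-separated tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    # The local version label after "+" (e.g. "cu124") is assumed to name the CUDA build
    public_version, _, local_label = version.partition("+")
    return {
        "name": name,
        "version": public_version,
        "local": local_label or None,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def current_cpython_tag() -> str:
    """Python tag of the running interpreter, assuming it is CPython."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


info = parse_wheel_filename("xformers-0.0.28.dev893+cu124-cp311-cp311-win_amd64.whl")
print(info["python"], info["local"], info["platform"])
print("matches this interpreter:", info["python"] == current_cpython_tag())
```

For the wheel named above, the tags read as CPython 3.11, a cu124 build, and 64-bit Windows, so it only fits a Python 3.11 environment on Windows.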

DarkAlchy commented 2 months ago

I lost a few nodes in comfy that demand FA2 (even if not used) so I had to roll back to 2.1.2 just so I can use the nodes. Florence2 is one of the errant nodes.