lllyasviel / Fooocus

Focus on prompting and generating
GNU General Public License v3.0

Support for Intel ARC GPUs #141

Open mindplay-dk opened 1 year ago

mindplay-dk commented 1 year ago

Any chance this will eventually have support for Intel ARC GPUs? The A770 is still the only affordable 16 GB GPU.

I'm sure AMD users are feeling a little left behind too.

yogaxu commented 1 year ago

Unfortunately, my personal PC only has an Intel integrated GPU and an AMD GPU, which makes it impossible to use such a great application.

makisukurisu commented 1 year ago

Bumping this one.

For example, there's https://github.com/vladmandic/automatic. This fork of AUTOMATIC1111's tool provides way more ways (he-he) to run generation.

I believe adding all three would be overkill, but can we at least expect to get DirectML/OpenVINO?

Thank you for this great tool! It's by far the easiest and most convenient one to use.

mashb1t commented 10 months ago

@mindplay-dk @makisukurisu DirectML support has been added, please check if setting --directml works for you. There are also further instructions (only for AMD GPUs on Windows, but they may also be applicable to ARC), see https://github.com/lllyasviel/Fooocus?tab=readme-ov-file#windowsamd-gpus. Your feedback is welcome!
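For the Windows one-click package, the linked section essentially swaps the bundled torch for torch-directml before launching with the flag; roughly like this (paths assumed from the default one-click layout, adjust if yours differs):

.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
.\python_embeded\python.exe -m pip install torch-directml
.\python_embeded\python.exe Fooocus\entry_with_update.py --directml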

mashb1t commented 10 months ago

Reference to discussion https://github.com/lllyasviel/Fooocus/discussions/1754#discussioncomment-8026436

makisukurisu commented 9 months ago

@mashb1t, thanks for letting us know (and sorry for the late reply; I only saw your response today).

I've tried that option, but it didn't help in my particular case. Then again, that's not a problem with Fooocus, but with my machine.

If you keep a compatibility table for these options, you could add a note that --directml doesn't work for (at least) the Intel i7-1165G7 with Intel Iris Xe Graphics (96 EUs). My assumption (though I have no way to verify it) is that all mobile Iris Xe GPUs of the same "generation" (of either the GPU or the CPU) will have the same issue.

The error comes from PyTorch (which, I assume, has nothing to do with Fooocus's code and simply reports a hardware limitation of my GPU):

[Fooocus Model Management] Moving model(s) has taken 11.76 seconds
  0%|                                                                                           | 0/30 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "C:\Files\git\Foocus\Fooocus\modules\async_worker.py", line 823, in worker
    handler(task)
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Files\git\Foocus\Fooocus\modules\async_worker.py", line 754, in handler
    imgs = pipeline.process_diffusion(
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Files\git\Foocus\Fooocus\modules\default_pipeline.py", line 361, in process_diffusion
    sampled_latent = core.ksampler(
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Files\git\Foocus\Fooocus\modules\core.py", line 313, in ksampler
    samples = ldm_patched.modules.sample.sample(model,
  File "C:\Files\git\Foocus\Fooocus\ldm_patched\modules\sample.py", line 101, in sample
    samples = sampler.sample(noise, positive_copy, negative_copy, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "C:\Files\git\Foocus\Fooocus\ldm_patched\modules\samplers.py", line 716, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Files\git\Foocus\Fooocus\modules\sample_hijack.py", line 157, in sample_hacked
    samples = sampler.sample(model_wrap, sigmas, extra_args, callback_wrap, noise, latent_image, denoise_mask, disable_pbar)
  File "C:\Files\git\Foocus\Fooocus\ldm_patched\modules\samplers.py", line 561, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Files\git\Foocus\Fooocus\ldm_patched\k_diffusion\sampling.py", line 701, in sample_dpmpp_2m_sde_gpu
    return sample_dpmpp_2m_sde(model, x, sigmas, extra_args=extra_args, callback=callback, disable=disable, eta=eta, s_noise=s_noise, noise_sampler=noise_sampler, solver_type=solver_type)
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Files\git\Foocus\Fooocus\ldm_patched\k_diffusion\sampling.py", line 613, in sample_dpmpp_2m_sde
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Files\git\Foocus\Fooocus\modules\patch.py", line 314, in patched_KSamplerX0Inpaint_forward
    out = self.inner_model(x, sigma,
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Files\git\Foocus\Fooocus\ldm_patched\modules\samplers.py", line 275, in forward
    return self.apply_model(*args, **kwargs)
  File "C:\Files\git\Foocus\Fooocus\ldm_patched\modules\samplers.py", line 272, in apply_model
    out = sampling_function(self.inner_model, x, timestep, uncond, cond, cond_scale, model_options=model_options, seed=seed)
  File "C:\Files\git\Foocus\Fooocus\modules\patch.py", line 229, in patched_sampling_function
    positive_x0, negative_x0 = calc_cond_uncond_batch(model, cond, uncond, x, timestep, model_options)
  File "C:\Files\git\Foocus\Fooocus\ldm_patched\modules\samplers.py", line 226, in calc_cond_uncond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
  File "C:\Files\git\Foocus\Fooocus\ldm_patched\modules\model_base.py", line 85, in apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Files\git\Foocus\Fooocus\modules\patch.py", line 371, in patched_unet_forward
    self.current_step = 1.0 - timesteps.to(x) / 999.0
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\_tensor.py", line 40, in wrapped
    return f(*args, **kwargs)
  File "C:\Files\git\Foocus\python_embeded\lib\site-packages\torch\_tensor.py", line 848, in __rsub__
    return _C._VariableFunctions.rsub(self, other)
RuntimeError: The GPU device does not support Double (Float64) operations!
Total time: 34.30 seconds
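For context, the failing line is self.current_step = 1.0 - timesteps.to(x) / 999.0 in modules/patch.py; the Python float scalars presumably get wrapped as Double (Float64) values somewhere in the DirectML backend, which my iGPU cannot execute. A minimal sketch of the kind of explicit cast that would sidestep the promotion (hypothetical helper, just to illustrate the issue):

import torch

def progress_fraction(timesteps: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    # Build the scalar constants explicitly in x's dtype so the whole
    # computation stays in float32/bf16 and a backend without FP64
    # support never sees a Double operation.
    one = torch.tensor(1.0, dtype=x.dtype, device=x.device)
    scale = torch.tensor(999.0, dtype=x.dtype, device=x.device)
    return one - timesteps.to(dtype=x.dtype, device=x.device) / scale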

Once again, thanks for your kind response, have a lovely day! P.S. Just to clarify: I wasn't expecting much, I just wanted to try this option with my iGPU anyway, since there is an AMD GPU entry and, in my experience, AMD iGPUs are quite capable. (AFAIK there is little to no architectural difference between some of their iGPUs and discrete GPUs apart from the number of compute units, though I may be wrong, and it isn't important anyway.)

cryscript commented 9 months ago

@makisukurisu I successfully got it running on an Arc A770 on Windows 10. For Windows, I recommend using the pre-built wheels from https://github.com/Nuullll/intel-extension-for-pytorch/releases/ or building them yourself from https://github.com/intel/intel-extension-for-pytorch.

In short, the installation should look something like this (Windows 10, Python 3.10):

git clone https://github.com/lllyasviel/Fooocus.git
cd Fooocus
python -m venv venv
.\venv\Scripts\activate.bat
python -m pip install "https://github.com/Nuullll/intel-extension-for-pytorch/releases/download/v2.1.10%2Bxpu/torch-2.1.0a0+cxx11.abi-cp310-cp310-win_amd64.whl" "https://github.com/Nuullll/intel-extension-for-pytorch/releases/download/v2.1.10%2Bxpu/torchvision-0.16.0a0+cxx11.abi-cp310-cp310-win_amd64.whl" "https://github.com/Nuullll/intel-extension-for-pytorch/releases/download/v2.1.10%2Bxpu/intel_extension_for_pytorch-2.1.10+xpu-cp310-cp310-win_amd64.whl"
python -m pip install -r requirements_versions.txt
python entry_with_update.py --disable-analytics --theme dark --unet-in-bf16 --vae-in-bf16 --clip-in-fp16

And it will run on IPEX, which should be faster than DirectML.
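If you want to verify the wheels before launching Fooocus, here is a quick sanity check from the activated venv (a sketch assuming the torch.xpu API that IPEX registers; the printed device name will differ per card):

import torch
import intel_extension_for_pytorch as ipex  # registers the "xpu" device with torch

print(torch.__version__, ipex.__version__)  # e.g. 2.1.0a0+cxx11.abi 2.1.10+xpu
print(torch.xpu.is_available())             # True if the Arc GPU is visible
print(torch.xpu.get_device_name(0))         # e.g. "Intel(R) Arc(TM) A770 Graphics"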

For subsequent runs after the installation you will need:

.\venv\Scripts\activate.bat
python entry_with_update.py --disable-analytics --theme dark --unet-in-bf16 --vae-in-bf16 --clip-in-fp16

(You can create a start.bat with this content and just run start.bat from then on.)

You can also try running your current installation on DirectML with the extra args --unet-in-bf16 --vae-in-bf16 --clip-in-fp16.
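That is, something like:

python entry_with_update.py --directml --unet-in-bf16 --vae-in-bf16 --clip-in-fp16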

mashb1t commented 9 months ago

@cryscript would you like to create a PR adding your steps to the readme? Much appreciated!

cryscript commented 9 months ago

@mashb1t I created a PR and tried to describe everything as clearly and in as much detail as possible: https://github.com/lllyasviel/Fooocus/pull/2120

JFYI: the torch and intel_extension_for_pytorch wheels are taken from Nuullll/intel-extension-for-pytorch because they are re-packaged:

with all the dependent dll files baked in torch and intel_extension_for_pytorch wheels, the users can simply install and use IPEX without installing extra oneAPI packages from Intel.

Thank you!

charliekayaker commented 8 months ago

Hi guys, I tried different ways but it doesn't work. Have you found another solution? My processor and GPU are shown in the attached image.