[Bug]: Image generation with Stable Diffusion XL and OpenVINO on iGPU LNL

azhuvath commented 1 month ago

OpenVINO Version

2024.4

Operating System

Windows System

Device used for inference

GPU

Framework

None

Model used

stabilityai/stable-diffusion-xl-base-1.0

Issue description

I am trying to follow the notebook below on an LNL (Lunar Lake) Windows machine, targeting the iGPU. https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/stable-diffusion-xl/stable-diffusion-xl.ipynb

I get the following error. Other models work on the GPU, so the issue is not with the GPU or the drivers. This is an INT8-quantized model.

onednn_verbose,info,oneDNN v3.6.0 (commit 4ccd07e3a10e1c08075cf824ac14708245fbc334)
onednn_verbose,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle
onednn_verbose,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle
onednn_verbose,info,gpu,runtime:OpenCL
onednn_verbose,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle
onednn_verbose,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle
onednn_verbose,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle
onednn_verbose,info,gpu,engine,0,name:Intel(R) Arc(TM) 140V GPU (16GB),driver_version:32.0.101,binary_kernels:enabled
onednn_verbose,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle
onednn_verbose,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle
onednn_verbose,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle
Traceback (most recent call last):
  File "C:\Users\devcloud\image_generation\generate_image.py", line 18, in <module>
    text2image_pipe.compile()
  File "C:\Users\devcloud\image_generation\sd_env\lib\site-packages\optimum\intel\openvino\modeling_diffusion.py", line 687, in compile
    component._compile()
  File "C:\Users\devcloud\image_generation\sd_env\lib\site-packages\optimum\intel\openvino\modeling_diffusion.py", line 779, in _compile
    self.request = core.compile_model(self.model, self._device, self.ov_config)
  File "C:\Users\devcloud\image_generation\sd_env\lib\site-packages\openvino\runtime\ie_api.py", line 543, in compile_model
    super().compile_model(model, device_name, {} if config is None else config),
RuntimeError: Exception from src/inference/src/cpp/core.cpp:107:
Exception from src/inference/src/dev/plugin.cpp:53:
Check 'false' failed at src/plugins/intel_gpu/src/plugin/program_builder.cpp:185:
[GPU] ProgramBuilder build failed!
Exception from src/plugins/intel_gpu/src/graph/include\primitive_type_base.h:59:
[GPU] Can't choose implementation for fullyconnectedcompressed:__module.model.down_blocks.1.resnets.0.time_emb_proj/aten::linear/MatMul node (type=fully_connected)   
[GPU] Original name: __module.model.down_blocks.1.resnets.0.time_emb_proj/aten::linear/MatMul
[GPU] Original type: FullyConnectedCompressed
[GPU] Reason: could not create a primitive
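
For triage, the log above can be condensed into the repeated oneDNN error and the node the GPU plugin rejected. The following is a minimal sketch that assumes the log keeps the line shapes shown above (comma-separated `onednn_verbose,primitive,error,...` lines and `[GPU] Original name:` / `[GPU] Reason:` lines); `summarize_gpu_failure` is a hypothetical helper, not part of OpenVINO or oneDNN.

```python
from collections import Counter


def summarize_gpu_failure(log_text):
    """Count oneDNN error messages and extract the node the GPU plugin rejected.

    Assumes the line formats seen in this report; other log layouts are ignored.
    """
    onednn_errors = Counter()
    failing_node = None
    reason = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if line.startswith("onednn_verbose,primitive,error,"):
            # Fields: marker, "primitive", "error", engine, implementation, message
            fields = line.split(",", 5)
            if len(fields) == 6:
                onednn_errors[(fields[4], fields[5])] += 1
        elif line.startswith("[GPU] Original name:"):
            # Split only once: node names themselves contain "::"
            failing_node = line.split(":", 1)[1].strip()
        elif line.startswith("[GPU] Reason:"):
            reason = line.split(":", 1)[1].strip()
    return {
        "onednn_errors": dict(onednn_errors),
        "failing_node": failing_node,
        "reason": reason,
    }


if __name__ == "__main__":
    sample = "\n".join(
        [
            "onednn_verbose,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle",
            "onednn_verbose,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle",
            "[GPU] Original name: __module.model.down_blocks.1.resnets.0.time_emb_proj/aten::linear/MatMul",
            "[GPU] Reason: could not create a primitive",
        ]
    )
    print(summarize_gpu_failure(sample))
```

Feeding the full console output to the helper makes it easier to see whether every failure comes from the same `jit::gemm` message and which layer finally aborts compilation.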

Step-by-step reproduction

import torch
from time import time
from pathlib import Path
from optimum.intel import OVStableDiffusionXLPipeline

start = time()
model_dir = Path("model_int8")

device = 'GPU'  # the reported failure comes from the GPU plugin; 'CPU' would not exercise it
text2image_pipe = OVStableDiffusionXLPipeline.from_pretrained(model_dir, device=device)
# Define the shapes related to the inputs and desired outputs
batch_size, num_images, height, width = 1, 1, 512, 512

# Statically reshape the model
text2image_pipe.reshape(batch_size, height, width, num_images)

# Compile the model before inference
text2image_pipe.compile()

prompt = "cute cat 4k, high-res, masterpiece, best quality, full hd, extremely detailed,  soft lighting, dynamic angle, 35mm"
image = text2image_pipe(
    prompt,
    num_inference_steps=25,
    height=height,
    width=width,
    num_images_per_prompt=num_images,
    generator=torch.Generator(device="cpu").manual_seed(903512),
).images[0]
image.save("cat.png")
end = time()
time_taken = end - start

print(f"Time taken for image generation: {time_taken:.2f} seconds")

Relevant log output

No response

Issue submission checklist

[X] I'm reporting an issue. It's not a question.
[X] I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
[X] There is reproducer code and related data files such as images, videos, models, etc.

Iffa-Intel commented 3 weeks ago

@azhuvath could you clarify:

You mentioned other models: are they the same model at a different precision, or completely different models?
Did you use the command optimum-cli export openvino -m stabilityai/stable-diffusion-xl-base-1.0 --weight-format int8 {model_dir}, as in the OpenVINO notebook, to get the INT8 model?
Did you both convert and run the model with OV 2024? (This is to confirm whether you are running model files converted with an older version on a newer one.)
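
To make the export step unambiguous when answering, the command quoted above can be assembled in a script. This is a minimal sketch: the flags are the ones quoted in this comment, the output directory `model_int8` is taken from the reproducer, and `build_export_command` is a hypothetical helper name.

```python
import shlex


def build_export_command(model_id, output_dir, weight_format="int8"):
    """Return the optimum-cli export invocation as an argument list."""
    return [
        "optimum-cli",
        "export",
        "openvino",
        "-m",
        model_id,
        "--weight-format",
        weight_format,
        str(output_dir),
    ]


command = build_export_command(
    "stabilityai/stable-diffusion-xl-base-1.0",
    "model_int8",  # same directory the reproducer loads from
)
# Print the exact command line so it can be pasted into the issue
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would run the export; the sketch only prints it so the exact invocation can be recorded alongside the results.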

azhuvath commented 2 weeks ago

I quantized the model with the Optimum CLI and used OpenVINO 2024.4, which is the latest version. I will test it once more after I get access to the system.
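
When retesting, it may help to record the package versions in the environment used for both conversion and inference. The sketch below uses only the standard library; the distribution names in `PACKAGES` are assumptions about what is installed in `sd_env`, and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata

# Assumed distribution names; adjust to match the actual environment.
PACKAGES = ("openvino", "optimum-intel", "optimum", "nncf", "torch")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Running this once in the environment that exported the model and once in the environment that compiles it would show whether the two steps used the same OpenVINO release.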

openvinotoolkit/openvino, issue #27218