nod-ai / SHARK

SHARK - High Performance Machine Learning Distribution

Failed to compile Unet #1378

Open · RomanADavis opened this issue 1 year ago

RomanADavis commented 1 year ago

Trying to run on the CPU. I have 16 GB of RAM.

shark_tank local cache is located at C:\Users\Roman\.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
vulkan devices are available.
cuda devices are not available.
diffusers\models\cross_attention.py:30: FutureWarning: Importing from cross_attention is deprecated. Please import from diffusers.models.attention_processor instead.
Running on local URL:  http://0.0.0.0:8080

To create a public link, set `share=True` in `launch()`.
Found device Intel(R) Iris(R) Xe Graphics. Using target triple .
Tuned models are currently not supported for this setting.
Downloading (…)cheduler_config.json: 100%|████████████████████████████████████████████████████| 345/345 [00:00<?, ?B/s]
huggingface_hub\file_download.py:133: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\Roman\.cache\huggingface\hub. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
failed to download model, falling back and using import_mlir
No vmfb found. Compiling and saving to C:\Users\Roman\Documents\euler_scale_model_input_1_512_512_vulkan_fp16.vmfb
Optimized kernel for your target device is not added yet.
        Contact SHARK Admin on discord[https://discord.com/invite/RUqY2h2s9u]
        or pull up an issue.
Target : deviceName        = Intel(R) Iris(R) Xe Graphics
Saved vmfb in C:\Users\Roman\Documents\euler_scale_model_input_1_512_512_vulkan_fp16.vmfb.
No vmfb found. Compiling and saving to C:\Users\Roman\Documents\euler_step_1_512_512_vulkan_fp16.vmfb
Optimized kernel for your target device is not added yet.
        Contact SHARK Admin on discord[https://discord.com/invite/RUqY2h2s9u]
        or pull up an issue.
Target : deviceName        = Intel(R) Iris(R) Xe Graphics
Saved vmfb in C:\Users\Roman\Documents\euler_step_1_512_512_vulkan_fp16.vmfb.
use_tuned? sharkify: False
_1_64_512_512_fp16_stable-diffusion-2-1-base
Downloading (…)tokenizer/vocab.json: 100%|████████████████████████████████████████| 1.06M/1.06M [00:00<00:00, 3.99MB/s]
Downloading (…)tokenizer/merges.txt: 100%|██████████████████████████████████████████| 525k/525k [00:00<00:00, 2.89MB/s]
Downloading (…)cial_tokens_map.json: 100%|████████████████████████████████████████████████████| 460/460 [00:00<?, ?B/s]
Downloading (…)okenizer_config.json: 100%|████████████████████████████████████████████████████| 824/824 [00:00<?, ?B/s]
Downloading (…)_encoder/config.json: 100%|████████████████████████████████████████████████████| 613/613 [00:00<?, ?B/s]
Downloading model.safetensors: 100%|██████████████████████████████████████████████| 1.36G/1.36G [02:32<00:00, 8.95MB/s]
transformers\modeling_utils.py:429: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  with safe_open(checkpoint_file, framework="pt") as f:
No vmfb found. Compiling and saving to C:\Users\Roman\Documents\clip_1_64_512_512_fp16_stable-diffusion-2-1-base_vulkan.vmfb
Optimized kernel for your target device is not added yet.
        Contact SHARK Admin on discord[https://discord.com/invite/RUqY2h2s9u]
        or pull up an issue.
Target : deviceName        = Intel(R) Iris(R) Xe Graphics
Saved vmfb in C:\Users\Roman\Documents\clip_1_64_512_512_fp16_stable-diffusion-2-1-base_vulkan.vmfb.
Downloading (…)ain/unet/config.json: 100%|████████████████████████████████████████████████████| 911/911 [00:00<?, ?B/s]
Downloading (…)ch_model.safetensors: 100%|████████████████████████████████████████| 3.46G/3.46G [06:26<00:00, 8.97MB/s]
No vmfb found. Compiling and saving to C:\Users\Roman\Documents\unet_1_64_512_512_fp16_stable-diffusion-2-1-base_vulkan.vmfb
Optimized kernel for your target device is not added yet.
        Contact SHARK Admin on discord[https://discord.com/invite/RUqY2h2s9u]
        or pull up an issue.
Target : deviceName        = Intel(R) Iris(R) Xe Graphics
Error invoking IREE compiler tool iree-compile.exe
Diagnostics:
<unknown>:0: error: failed to legalize operation 'vector.bitcast'
<eval_with_key>.12:28:12: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.3, [Shader, GroupNonUniform], [SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, #spirv.resource_limits<max_compute_workgroup_size = [128, 128, 64], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
<eval_with_key>.12:28:12: error: failed to serialize executables
<unknown>:0: error: failed to legalize operation 'vector.bitcast'
<eval_with_key>.12:33:14: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.3, [Shader, GroupNonUniform], [SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, #spirv.resource_limits<max_compute_workgroup_size = [128, 128, 64], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
<eval_with_key>.12:33:14: error: failed to serialize executables
<unknown>:0: error: failed to legalize operation 'vector.bitcast'
<eval_with_key>.12:71:14: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.3, [Shader, GroupNonUniform], [SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, #spirv.resource_limits<max_compute_workgroup_size = [128, 128, 64], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
<eval_with_key>.12:71:14: error: failed to serialize executables
<unknown>:0: error: failed to legalize operation 'vector.bitcast'
<eval_with_key>.12:144:12: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.3, [Shader, GroupNonUniform], [SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, #spirv.resource_limits<max_compute_workgroup_size = [128, 128, 64], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
<eval_with_key>.12:144:12: error: failed to serialize executables
<unknown>:0: error: failed to legalize operation 'vector.bitcast'
<eval_with_key>.12:162:11: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.3, [Shader, GroupNonUniform], [SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, #spirv.resource_limits<max_compute_workgroup_size = [128, 128, 64], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
<eval_with_key>.12:162:11: error: failed to serialize executables

Invoked with:
 iree-compile.exe C:\Users\Roman\AppData\Local\Temp\_MEI126322\iree\compiler\tools\..\_mlir_libs\iree-compile.exe - --iree-input-type=tm_tensor --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=vulkan --iree-llvmcpu-embedded-linker-path=C:\Users\Roman\AppData\Local\Temp\_MEI126322\iree\compiler\tools\..\_mlir_libs\iree-lld.exe --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvmcpu-target-cpu-features=host --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-vm-bytecode-module-strip-source-map=true --iree-util-zero-fill-elided-attrs --iree-preprocessing-pass-pipeline=builtin.module(func.func(iree-flow-detach-elementwise-from-named-ops,iree-flow-convert-1x1-filter-conv2d-to-matmul,iree-preprocessing-convert-conv2d-to-img2col,iree-preprocessing-pad-linalg-ops{pad-size=32}))

Need more information? Set IREE_SAVE_TEMPS=/some/dir in your environment to save all artifacts and reproducers.

Retrying with a different base model configuration
mat1 and mat2 shapes cannot be multiplied (128x768 and 1024x320)
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[2, 9, 64, 64] to have 4 channels, but got 9 channels instead
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[2, 9, 64, 64] to have 4 channels, but got 9 channels instead
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[4, 7, 512, 512] to have 4 channels, but got 7 channels instead
Retrying with a different base model configuration
Traceback (most recent call last):
  File "gradio\routes.py", line 401, in run_predict
  File "gradio\blocks.py", line 1302, in process_api
  File "gradio\blocks.py", line 1039, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "gradio\utils.py", line 491, in async_iteration
  File "ui\txt2img_ui.py", line 177, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_txt2img.py", line 122, in generate_images
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 203, in produce_img_latents
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 103, in load_unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 640, in unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 635, in unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 59, in check_compilation
SystemExit: Could not compile Unet. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues
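
The "Retrying with a different base model configuration" errors above look like plain hidden-size mismatches between the text encoder and the UNet variant being tried: SD 1.x text embeddings are 768-dim while SD 2.x embeddings are 1024-dim, so feeding one into the other's cross-attention fails exactly as reported. A minimal sketch of the first mismatch, with the shapes taken from the error text (the tensor names are mine):

```python
import torch

# Shapes from the first retry error above: mat1 is 2 prompts x 64 tokens of
# 768-dim (SD 1.x-style) embeddings, mat2 is a cross-attention weight that
# expects 1024-dim (SD 2.x-style) input.
embeddings = torch.randn(2 * 64, 768)
to_k_weight = torch.randn(1024, 320)

embeddings @ to_k_weight
# RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x768 and 1024x320)
```

Here is the CPU attempt, which fails in a different way:
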
Tuned models are currently not supported for this setting.
Downloading (…)cheduler_config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 313/313 [00:00<00:00, 79.8kB/s]
failed to download model, falling back and using import_mlir
No vmfb found. Compiling and saving to C:\Users\Roman\Documents\euler_scale_model_input_1_512_512_cpu_fp16.vmfb
Target triple found:x86_64-pc-windows-msvc
Saved vmfb in C:\Users\Roman\Documents\euler_scale_model_input_1_512_512_cpu_fp16.vmfb.
No vmfb found. Compiling and saving to C:\Users\Roman\Documents\euler_step_1_512_512_cpu_fp16.vmfb
Target triple found:x86_64-pc-windows-msvc
Saved vmfb in C:\Users\Roman\Documents\euler_step_1_512_512_cpu_fp16.vmfb.
use_tuned? sharkify: False
_1_64_512_512_fp16_stable-diffusion-v1-4
Downloading (…)tokenizer/vocab.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.06M/1.06M [00:00<00:00, 2.75MB/s]
Downloading (…)tokenizer/merges.txt: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 525k/525k [00:00<00:00, 1.81MB/s]
Downloading (…)cial_tokens_map.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 472/472 [00:00<00:00, 235kB/s]
Downloading (…)okenizer_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 806/806 [00:00<00:00, 329kB/s]
Downloading (…)_encoder/config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 592/592 [00:00<?, ?B/s]
Downloading model.safetensors: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 492M/492M [00:55<00:00, 8.90MB/s]
No vmfb found. Compiling and saving to C:\Users\Roman\Documents\clip_1_64_512_512_fp16_stable-diffusion-v1-4_cpu.vmfb
Target triple found:x86_64-pc-windows-msvc
Saved vmfb in C:\Users\Roman\Documents\clip_1_64_512_512_fp16_stable-diffusion-v1-4_cpu.vmfb.
Downloading (…)ain/unet/config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 743/743 [00:00<?, ?B/s]
Downloading (…)ch_model.safetensors: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.44G/3.44G [06:36<00:00, 8.67MB/s]
mat1 and mat2 shapes cannot be multiplied (128x1024 and 768x320)
Retrying with a different base model configuration
No vmfb found. Compiling and saving to C:\Users\Roman\Documents\unet_1_64_512_512_fp16_stable-diffusion-v1-4_cpu.vmfb
Target triple found:x86_64-pc-windows-msvc
Error invoking IREE compiler tool iree-compile.exe
Diagnostics:
iree-lld: error: undefined symbol: fmaxf
>>> referenced by <eval_with_key>.25:187
>>>               C:\Users\Roman\AppData\Local\Temp\llvm_module_linked_llvm_cpu-826986.o:(forward_dispatch_49_softmax_2x8x4096x4096xf16)
>>> referenced by <eval_with_key>.25:187
>>>               C:\Users\Roman\AppData\Local\Temp\llvm_module_linked_llvm_cpu-826986.o:(forward_dispatch_49_softmax_2x8x4096x4096xf16)
>>> referenced by <eval_with_key>.25:187
>>>               C:\Users\Roman\AppData\Local\Temp\llvm_module_linked_llvm_cpu-826986.o:(forward_dispatch_49_softmax_2x8x4096x4096xf16)
>>> referenced 1733 more times
>>> did you mean: fmaf
>>> defined in: C:\Users\Roman\AppData\Local\Temp\llvm_module_linked_llvm_cpu-826986.o
Linking failed; escaped command line returned exit code 1:

set LLD_VERSION=IREE && C:\Users\Roman\AppData\Local\Temp\_MEI126322\iree\compiler\tools\..\_mlir_libs\iree-lld.exe -flavor gnu -o C:\Users\Roman\AppData\Local\Temp\llvm_module_linked_llvm_cpu-826986.so --build-id=none -nostdlib -static -shared --no-undefined --no-allow-shlib-undefined --allow-multiple-definition --gc-sections -z now -z relro --discard-all --icf=all --ignore-data-address-equality --ignore-function-address-equality --hash-style=sysv C:\Users\Roman\AppData\Local\Temp\llvm_module_linked_llvm_cpu-826986.o

<unknown>:0: error: failed to link executable and generate target dylib (check above for more specific error messages)
<unknown>:0: error: failed to serialize executable for target backend llvm-cpu
<unknown>:0: error: failed to serialize executables

Invoked with:
 iree-compile.exe C:\Users\Roman\AppData\Local\Temp\_MEI126322\iree\compiler\tools\..\_mlir_libs\iree-compile.exe - --iree-input-type=tm_tensor --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-embedded-linker-path=C:\Users\Roman\AppData\Local\Temp\_MEI126322\iree\compiler\tools\..\_mlir_libs\iree-lld.exe --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvmcpu-target-cpu-features=host --iree-llvmcpu-target-triple=x86_64-pc-windows-msvc --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-vm-bytecode-module-strip-source-map=true --iree-util-zero-fill-elided-attrs --iree-preprocessing-pass-pipeline=builtin.module(func.func(iree-flow-detach-elementwise-from-named-ops,iree-flow-convert-1x1-filter-conv2d-to-matmul,iree-preprocessing-convert-conv2d-to-img2col,iree-preprocessing-pad-linalg-ops{pad-size=32}))

Need more information? Set IREE_SAVE_TEMPS=/some/dir in your environment to save all artifacts and reproducers.

Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[2, 9, 64, 64] to have 4 channels, but got 9 channels instead
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[2, 9, 64, 64] to have 4 channels, but got 9 channels instead
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[4, 7, 512, 512] to have 4 channels, but got 7 channels instead
Retrying with a different base model configuration
Traceback (most recent call last):
  File "gradio\routes.py", line 401, in run_predict
  File "gradio\blocks.py", line 1302, in process_api
  File "gradio\blocks.py", line 1039, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "gradio\utils.py", line 491, in async_iteration
  File "ui\txt2img_ui.py", line 177, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_txt2img.py", line 122, in generate_images
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 203, in produce_img_latents
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 103, in load_unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 640, in unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 635, in unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 59, in check_compilation
SystemExit: Could not compile Unet. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues
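
If I am reading the link error right, the undefined `fmaxf` comes from the f16 softmax dispatches (e.g. `forward_dispatch_49_softmax_2x8x4096x4096xf16`): a numerically stable softmax subtracts the row max before exponentiating, and that max reduction apparently lowers to a call to the libm float function `fmaxf`, which the embedded linker does not provide (lld's "did you mean: fmaf" is just a near-miss suggestion, not a fix). A sketch of what that dispatch computes, with the shape read from the dispatch name and scaled down so it runs cheaply:

```python
import torch

# Shape from the dispatch name forward_dispatch_49_softmax_2x8x4096x4096xf16,
# scaled down from 4096 so the sketch needs a few MB instead of ~1 GB.
scores = torch.randn(2, 8, 256, 256, dtype=torch.float16)

# Stable softmax subtracts the row max before exponentiating; the f16 max
# reduction is presumably what ends up as the unresolved fmaxf call.
probs = torch.softmax(scores, dim=-1)
```
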
Xeniac-at commented 1 year ago

I also get the 'iree-lld: error: undefined symbol: fmaxf' message. I tried it on both Windows 11 and Debian 12 (Bookworm).
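
Before the full log: the compiler hint at the bottom says to set IREE_SAVE_TEMPS to capture all artifacts and reproducers. A minimal sketch of doing that from Python before anything triggers a compile (the directory is a placeholder):

```python
import os

# Per the "Need more information?" hint at the end of the log below:
# save all IREE artifacts and reproducers. This must be set before the
# compiler is invoked; the directory is a placeholder.
os.environ["IREE_SAVE_TEMPS"] = "/tmp/iree_repro"
```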

(shark.venv) xeniac@debian12:~/SHARK$ python3 apps/stable_diffusion/web/index.py 
shark_tank local cache is located at /home/xeniac/.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
vulkan devices are available.
cuda devices are available.
/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/gradio/components.py:153: UserWarning: Unknown style parameter: columns
  warnings.warn(f"Unknown style parameter: {key}")
/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/gradio/components.py:153: UserWarning: Unknown style parameter: object_fit
  warnings.warn(f"Unknown style parameter: {key}")
/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/diffusers/models/cross_attention.py:30: FutureWarning: Importing from cross_attention is deprecated. Please import from diffusers.models.attention_processor instead.
  deprecate(
Running on local URL:  http://0.0.0.0:8080

To create a public link, set `share=True` in `launch()`.
Tuned models are currently not supported for this setting.
No internet connection. Using the model already present in the tank.
Model artifacts for euler_scale_model_input_fp16 found at /home/xeniac/.local/shark_tank/...
No internet connection. Using the model already present in the tank.
loading existing vmfb from: /home/xeniac/SHARK/euler_scale_model_input_fp16.vmfb
No internet connection. Using the model already present in the tank.
Model artifacts for euler_step_fp16 found at /home/xeniac/.local/shark_tank/...
No internet connection. Using the model already present in the tank.
loading existing vmfb from: /home/xeniac/SHARK/euler_step_fp16.vmfb
use_tuned? sharkify: False
_1_64_512_512_fp16_stable-diffusion-2-1-base
No internet connection. Using the model already present in the tank.
Downloading artifacts for model clip_1_64_512_512_fp16_stable-diffusion-2-1-base_vulkan from: gs://shark_tank/nightly/clip_1_64_512_512_fp16_stable-diffusion-2-1-base_vulkan_torch
download pipeline failed, falling back to import_mlir
No internet connection. Using the model already present in the tank.
Downloading artifacts for model unet_1_64_512_512_fp16_stable-diffusion-2-1-base_vulkan from: gs://shark_tank/nightly/unet_1_64_512_512_fp16_stable-diffusion-2-1-base_vulkan_torch
The model data was not found. Trying to generate artifacts locally.
download pipeline failed, falling back to import_mlir
No vmfb found. Compiling and saving to /home/xeniac/SHARK/unet_1_64_512_512_fp16_stable-diffusion-2-1-base_cpu.vmfb
Configuring for device:cpu
Target triple found:x86_64-linux-gnu
Traceback (most recent call last):
  File "/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/gradio/routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/gradio/blocks.py", line 1069, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/gradio/blocks.py", line 892, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/gradio/utils.py", line 549, in async_iteration
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/apps/stable_diffusion/web/ui/txt2img_ui.py", line 180, in txt2img_inf
    out_imgs = global_obj.get_sd_obj().generate_images(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/apps/stable_diffusion/src/pipelines/pipeline_shark_stable_diffusion_txt2img.py", line 122, in generate_images
    latents = self.produce_img_latents(
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/apps/stable_diffusion/src/pipelines/pipeline_shark_stable_diffusion_utils.py", line 203, in produce_img_latents
    self.load_unet()
  File "/home/xeniac/SHARK/apps/stable_diffusion/src/pipelines/pipeline_shark_stable_diffusion_utils.py", line 109, in load_unet
    self.unet = self.sd_model.unet()
                ^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/apps/stable_diffusion/src/models/model_wrappers.py", line 656, in unet
    sys.exit(e)
  File "/home/xeniac/SHARK/apps/stable_diffusion/src/models/model_wrappers.py", line 627, in unet
    compiled_unet, unet_mlir = self.compile_unet_variants(model)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/apps/stable_diffusion/src/models/model_wrappers.py", line 591, in compile_unet_variants
    return self.get_unet()
           ^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/apps/stable_diffusion/src/models/model_wrappers.py", line 462, in get_unet
    shark_unet, unet_mlir = compile_through_fx(
                            ^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/apps/stable_diffusion/src/utils/utils.py", line 162, in compile_through_fx
    _compile_module(shark_module, extended_model_name, extra_args),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/apps/stable_diffusion/src/utils/utils.py", line 71, in _compile_module
    path = shark_module.save_module(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark/shark_inference.py", line 188, in save_module
    return export_iree_module_to_vmfb(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark/iree_utils/compile_utils.py", line 352, in export_iree_module_to_vmfb
    flatbuffer_blob = compile_module_to_flatbuffer(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark/iree_utils/compile_utils.py", line 280, in compile_module_to_flatbuffer
    flatbuffer_blob = ireec.compile_str(
                      ^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/iree/compiler/tools/core.py", line 280, in compile_str
    result = invoke_immediate(cl, immediate_input=input_bytes)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/iree/compiler/tools/binaries.py", line 196, in invoke_immediate
    raise CompilerToolError(process)
SystemExit: Error invoking IREE compiler tool iree-compile
Diagnostics:
iree-lld: error: undefined symbol: fmaxf
>>> referenced by <eval_with_key>.3:191
>>>               /tmp/llvm_module_linked_llvm_cpu-8ca577.o:(forward_dispatch_47_softmax_2x5x4096x4096xf16)
>>> referenced by <eval_with_key>.3:191
>>>               /tmp/llvm_module_linked_llvm_cpu-8ca577.o:(forward_dispatch_47_softmax_2x5x4096x4096xf16)
>>> referenced by <eval_with_key>.3:191
>>>               /tmp/llvm_module_linked_llvm_cpu-8ca577.o:(forward_dispatch_47_softmax_2x5x4096x4096xf16)
>>> referenced 1733 more times
>>> did you mean: fmaf
>>> defined in: /tmp/llvm_module_linked_llvm_cpu-8ca577.o
Linking failed; escaped command line returned exit code 256:

LLD_VERSION=IREE /home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-lld -flavor gnu -o /tmp/llvm_module_linked_llvm_cpu-8ca577.so --build-id=none -nostdlib -static -shared --no-undefined --no-allow-shlib-undefined --allow-multiple-definition --gc-sections -z now -z relro --discard-all --icf=all --ignore-data-address-equality --ignore-function-address-equality --hash-style=sysv /tmp/llvm_module_linked_llvm_cpu-8ca577.o

<unknown>:0: error: failed to link executable and generate target dylib (check above for more specific error messages)
<unknown>:0: error: failed to serialize executable for target backend llvm-cpu
<unknown>:0: error: failed to serialize executables

Invoked with:
 iree-compile /home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-compile - --iree-input-type=tm_tensor --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-embedded-linker-path=/home/xeniac/SHARK/shark.venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-lld --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvmcpu-target-cpu-features=host --iree-llvmcpu-target-triple=x86_64-linux-gnu --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-vm-bytecode-module-strip-source-map=true --iree-util-zero-fill-elided-attrs --iree-preprocessing-pass-pipeline=builtin.module(func.func(iree-flow-detach-elementwise-from-named-ops,iree-flow-convert-1x1-filter-conv2d-to-matmul,iree-preprocessing-convert-conv2d-to-img2col,iree-preprocessing-pad-linalg-ops{pad-size=32}))

Need more information? Set IREE_SAVE_TEMPS=/some/dir in your environment to save all artifacts and reproducers.
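
For anyone triaging this: the same compile can presumably be reproduced outside the web UI through the Python API the traceback goes through (`ireec.compile_str` in iree/compiler/tools/core.py). A minimal sketch, with the input .mlir file as a placeholder and the key flags copied from the "Invoked with" line above:

```python
from iree.compiler import tools as ireec

# Placeholder input: the real module is the UNet MLIR that SHARK pipes
# into compile_str (see compile_module_to_flatbuffer in the traceback).
with open("unet_1_64_512_512_fp16.mlir") as f:
    mlir_module = f.read()

vmfb = ireec.compile_str(
    mlir_module,
    target_backends=["llvm-cpu"],  # matches --iree-hal-target-backends=llvm-cpu
    extra_args=[
        # Copied from the "Invoked with" line above.
        "--iree-input-type=tm_tensor",
        "--iree-llvmcpu-target-cpu-features=host",
        "--iree-llvmcpu-target-triple=x86_64-linux-gnu",
    ],
)
```

This should hit the same "undefined symbol: fmaxf" link error without going through Gradio.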