nod-ai / SHARK

SHARK - High Performance Machine Learning Distribution
Apache License 2.0

Custom models fail to compile on 7900 XTX. #984

Open stonedDiscord opened 1 year ago

stonedDiscord commented 1 year ago

RX 7900 XTX on driver 23.1.2

I tried Waifu Diffusion this time; I get the same error with OrangeMix Abyss2.

vulkan devices are available.
cuda devices are not available.
Running on local URL:  http://0.0.0.0:8080

To create a public link, set `share=True` in `launch()`.
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using tuned models for hakurei/waifu-diffusion/fp16/vulkan://00000000-0800-0000-0000-000000000000.
D:\Git\SHARK\shark.venv\lib\site-packages\torch\jit\_check.py:172: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in `__init__`. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in `torch.jit.Attribute`.
  warnings.warn("The TorchScript type system doesn't support "
loading existing vmfb from: D:\Git\SHARK\apps\stable_diffusion\web\euler_scale_model_input_1_512_512fp16_vulkan-00000000-0800-0000-0000-000000000000.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
loading existing vmfb from: D:\Git\SHARK\apps\stable_diffusion\web\euler_step_1_512_512fp16_vulkan-00000000-0800-0000-0000-000000000000.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Inferring base model configuration.
D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\node.py:244: UserWarning: Trying to prepend a node to itself. This behavior has no effect on the graph.
  warnings.warn("Trying to prepend a node to itself. This behavior has no effect on the graph.")
Loading Winograd config file from  C:\Users\stoned\.local/shark_tank/configs/unet_winograd_vulkan.json
100%|█████████████████████████████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 6.85kB/s]
100%|█████████████████████████████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 3.42kB/s]
Traceback (most recent call last):
  File "D:\Git\SHARK\apps\stable_diffusion\src\models\model_wrappers.py", line 277, in __call__
    compiled_clip, compiled_unet, compiled_vae = self.compile_all(model_id)
  File "D:\Git\SHARK\apps\stable_diffusion\src\models\model_wrappers.py", line 230, in compile_all
    compiled_unet = self.get_unet()
  File "D:\Git\SHARK\apps\stable_diffusion\src\models\model_wrappers.py", line 188, in get_unet
    shark_unet = compile_through_fx(
  File "D:\Git\SHARK\apps\stable_diffusion\src\utils\utils.py", line 103, in compile_through_fx
    mlir_module = sd_model_annotation(mlir_module, model_name)
  File "D:\Git\SHARK\apps\stable_diffusion\src\utils\sd_annotation.py", line 204, in sd_model_annotation
    lowering_config_dir = load_lower_configs()
  File "D:\Git\SHARK\apps\stable_diffusion\src\utils\sd_annotation.py", line 66, in load_lower_configs
    variant, version = get_variant_version(args.hf_model_id)
  File "D:\Git\SHARK\apps\stable_diffusion\src\models\opt_params.py", line 23, in get_variant_version
    return hf_model_variant_map[hf_model_id]
KeyError: 'hakurei/waifu-diffusion'
Retrying with a different base model configuration
Traceback (most recent call last):
  File "D:\Git\SHARK\apps\stable_diffusion\src\models\model_wrappers.py", line 277, in __call__
    compiled_clip, compiled_unet, compiled_vae = self.compile_all(model_id)
  File "D:\Git\SHARK\apps\stable_diffusion\src\models\model_wrappers.py", line 230, in compile_all
    compiled_unet = self.get_unet()
  File "D:\Git\SHARK\apps\stable_diffusion\src\models\model_wrappers.py", line 188, in get_unet
    shark_unet = compile_through_fx(
  File "D:\Git\SHARK\apps\stable_diffusion\src\utils\utils.py", line 96, in compile_through_fx
    mlir_module, func_name = import_with_fx(
  File "D:\Git\SHARK\shark\shark_importer.py", line 382, in import_with_fx
    fx_g = make_fx(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 716, in wrapped
    t = dispatch_trace(wrap_key(func, args, fx_tracer), tracer=fx_tracer, concrete_args=tuple(phs))
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 450, in dispatch_trace
    graph = tracer.trace(root, concrete_args)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 778, in trace
    (self.create_arg(fn(*args)),),
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 466, in wrapped
    out = f(*tensors)
  File "<string>", line 1, in <lambda>
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 756, in module_call_wrapper
    return self.call_module(mod, forward, args, kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 416, in call_module
    return forward(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 749, in forward
    return _orig_module_call(mod, *args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Git\SHARK\apps\stable_diffusion\src\models\model_wrappers.py", line 175, in forward
    unet_out = self.unet.forward(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\diffusers\models\unet_2d_condition.py", line 481, in forward
    sample, res_samples = downsample_block(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 756, in module_call_wrapper
    return self.call_module(mod, forward, args, kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 416, in call_module
    return forward(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 749, in forward
    return _orig_module_call(mod, *args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\diffusers\models\unet_2d_blocks.py", line 789, in forward
    hidden_states = attn(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 756, in module_call_wrapper
    return self.call_module(mod, forward, args, kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 416, in call_module
    return forward(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 749, in forward
    return _orig_module_call(mod, *args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\diffusers\models\transformer_2d.py", line 265, in forward
    hidden_states = block(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 756, in module_call_wrapper
    return self.call_module(mod, forward, args, kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 416, in call_module
    return forward(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 749, in forward
    return _orig_module_call(mod, *args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\diffusers\models\attention.py", line 307, in forward
    attn_output = self.attn2(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 756, in module_call_wrapper
    return self.call_module(mod, forward, args, kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 416, in call_module
    return forward(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 749, in forward
    return _orig_module_call(mod, *args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\diffusers\models\cross_attention.py", line 160, in forward
    return self.processor(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\diffusers\models\cross_attention.py", line 234, in __call__
    key = attn.to_k(encoder_hidden_states)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 756, in module_call_wrapper
    return self.call_module(mod, forward, args, kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 416, in call_module
    return forward(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\_symbolic_trace.py", line 749, in forward
    return _orig_module_call(mod, *args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\utils\_stats.py", line 15, in wrapper
    return fn(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 494, in __torch_dispatch__
    return self.inner_torch_dispatch(func, types, args, kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 519, in inner_torch_dispatch
    out = proxy_call(self, func, args, kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\fx\experimental\proxy_tensor.py", line 352, in proxy_call
    out = func(*args, **kwargs)
  File "D:\Git\SHARK\shark.venv\lib\site-packages\torch\_ops.py", line 284, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x768 and 1024x320)
Retrying with a different base model configuration
Traceback (most recent call last):
  File "D:\Git\SHARK\shark.venv\lib\site-packages\gradio\routes.py", line 374, in run_predict
    output = await app.get_blocks().process_api(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\gradio\blocks.py", line 1017, in process_api
    result = await self.call_function(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\gradio\blocks.py", line 835, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "D:\Git\SHARK\shark.venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "D:\Git\SHARK\shark.venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "D:\Git\SHARK\apps\stable_diffusion\scripts\txt2img.py", line 214, in txt2img_inf
    txt2img_obj = Text2ImagePipeline.from_pretrained(
  File "D:\Git\SHARK\apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 200, in from_pretrained
    clip, unet, vae = mlir_import()
  File "D:\Git\SHARK\apps\stable_diffusion\src\models\model_wrappers.py", line 293, in __call__
    sys.exit(
SystemExit: Cannot compile the model. Please re-run the command with `--enable_stack_trace` flag and create an issue with detailed log at https://github.com/nod-ai/SHARK/issues
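For what it's worth, the two tracebacks show two distinct failures. The first dies in `get_variant_version` because `hakurei/waifu-diffusion` has no entry in `hf_model_variant_map`, so the tuned-config lookup raises `KeyError`. The retry then fails inside the UNet trace: `F.linear` receives 768-wide encoder hidden states while the guessed base configuration's `to_k` projection expects 1024-wide input. A minimal PyTorch sketch that reproduces the exact shape error (the batch and sequence sizes here are hypothetical):

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: 768-wide text embeddings (SD 1.x CLIP output)
# fed into a to_k projection sized for a 1024-wide context.
encoder_hidden_states = torch.randn(2, 64, 768)  # flattens to 128x768
to_k_weight = torch.randn(320, 1024)             # F.linear applies its transpose, 1024x320

# RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x768 and 1024x320)
F.linear(encoder_hidden_states, to_k_weight)
```

That mismatch is consistent with a 768-dim (SD 1.x style) text encoder being paired with a 1024-dim (SD 2.x style) cross-attention configuration, which would explain why every retry fails.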
powderluv commented 1 year ago

Thank you for the report. We tried to roll out OTF (on-the-fly) tuning for custom models on the 7900 XTX but had to revert it because of this issue. We should reland it soon.
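In the meantime, the `KeyError` above comes from a plain dict lookup in `opt_params.get_variant_version`. A hedged sketch of a more explicit failure path (names and behavior are assumptions for illustration, not the actual SHARK code):

```python
# Hypothetical sketch, not the actual SHARK implementation. Assumes
# hf_model_variant_map is a plain dict keyed by Hugging Face model IDs.
def get_variant_version(hf_model_id, hf_model_variant_map):
    try:
        return hf_model_variant_map[hf_model_id]
    except KeyError:
        # Surface the real cause instead of letting a bare KeyError feed
        # the "Retrying with a different base model configuration" loop.
        raise ValueError(
            f"no tuned variant registered for {hf_model_id!r}; "
            "custom models would need untuned or on-the-fly tuned configs"
        ) from None
```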

FeuFeuAngel commented 1 year ago

I tried to run andite/anything-v4.0 on my MBA 7900 XTX; sadly, no luck.

shark_tank local cache is located atX.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
vulkan devices are available.
cuda devices are not available.
Running on local URL:  http://0.0.0.0:8080

To create a public link, set `share=True` in `launch()`.
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using tuned models for andite/anything-v4.0/fp16/vulkan://00000000-0300-0000-0000-000000000000.
torch\jit\_check.py:172: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in `__init__`. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in `torch.jit.Attribute`.
  warnings.warn("The TorchScript type system doesn't support "
loading existing vmfb from:XDownloads\Neuer Ordner\euler_scale_model_input_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
loading existing vmfb from:XDownloads\Neuer Ordner\euler_step_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Inferring base model configuration.
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
torch\fx\node.py:250: UserWarning: Trying to prepend a node to itself. This behavior has no effect on the graph.
  warnings.warn("Trying to prepend a node to itself. This behavior has no effect on the graph.")
Loading Winograd config file from X.local/shark_tank/configs/unet_winograd_vulkan.json
100%|███████████████████████████████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 936B/s]
100%|███████████████████████████████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 744B/s]
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Traceback (most recent call last):
  File "gradio\routes.py", line 374, in run_predict
  File "gradio\blocks.py", line 1017, in process_api
  File "gradio\blocks.py", line 835, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "apps\stable_diffusion\scripts\txt2img.py", line 116, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 220, in from_pretrained
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 348, in __call__
SystemExit: Cannot compile the model. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues
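A side note on the repeated `accelerate` warning in this log: it only affects load speed and peak RAM, not the compile failure itself. With `accelerate` installed, diffusers can load weights without materializing a second in-memory copy; a small sketch (the model ID is taken from the log above, and `low_cpu_mem_usage` is the standard diffusers option):

```python
# pip install accelerate
from diffusers import UNet2DConditionModel

# With accelerate present, low_cpu_mem_usage=True loads weights directly
# into place instead of first building a fully random-initialized model.
unet = UNet2DConditionModel.from_pretrained(
    "andite/anything-v4.0", subfolder="unet", low_cpu_mem_usage=True
)
```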
powderluv commented 1 year ago

Apparently it is a troll model; please use the real Anything v3.

FeuFeuAngel commented 1 year ago

> Apparently it is a troll model; please use the real Anything v3.

Really? I used OnnxDiffusersUI before and got better results there. Maybe that was because of its new built-in KDPM2 scheduler. I am currently testing nod-ai and having some difficulties: I have the feeling my prompts are cut off or not respected as much as in OnnxDiffusersUI. (Or maybe I use too many prompt terms? At least my last prompt terms are usually not applied, even when I put them in parentheses.)
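On prompts feeling cut off: if the UI tokenizes with the standard CLIP tokenizer, anything beyond the model's 77-token context window is silently truncated, which would match trailing terms being ignored. A quick way to check how many tokens a prompt actually uses (assuming the stock `transformers` CLIP tokenizer; the model ID below is the usual SD 1.x text encoder, not necessarily what SHARK loads):

```python
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
prompt = "masterpiece, best quality, ..."  # your long prompt here
ids = tokenizer(prompt).input_ids
# CLIP's context window is tokenizer.model_max_length (77, including the
# begin/end tokens); everything past it gets truncated.
print(len(ids), "tokens; limit is", tokenizer.model_max_length)
```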

Marormur commented 1 year ago

I just tried to use nitrosocke/Nitro-Diffusion and hakurei/waifu-diffusion. I'm using the current version (last commit was 962470f61046467c351a6cb65ab1a79fbfbd2ff2) and I also still get error messages:

(shark.venv) PS C:\Users\marvi\SHARK\apps\stable_diffusion\web> python .\index.py
shark_tank local cache is located at C:\Users\marvi\.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
vulkan devices are available.
cuda devices are not available.
Running on local URL:  http://0.0.0.0:8080

To create a public link, set `share=True` in `launch()`.
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using tuned models for hakurei/waifu-diffusion/fp16/vulkan://00000000-0d00-0000-0000-000000000000.
C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\torch\jit\_check.py:172: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in `__init__`. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in `torch.jit.Attribute`.
  warnings.warn("The TorchScript type system doesn't support "
No vmfb found. Compiling and saving to C:\Users\marvi\SHARK\apps\stable_diffusion\web\euler_scale_model_input_1_512_512fp16.vmfb
Using target triple -iree-vulkan-target-triple=rdna3-7900-windows from command line args
Saved vmfb in C:\Users\marvi\SHARK\apps\stable_diffusion\web\euler_scale_model_input_1_512_512fp16.vmfb.
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_VERBOSE does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_DEBUG does not conform to naming standard (Policy #LLP_LAYER_3)
No vmfb found. Compiling and saving to C:\Users\marvi\SHARK\apps\stable_diffusion\web\euler_step_1_512_512fp16.vmfb
Using target triple -iree-vulkan-target-triple=rdna3-7900-windows from command line args
Saved vmfb in C:\Users\marvi\SHARK\apps\stable_diffusion\web\euler_step_1_512_512fp16.vmfb.
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_VERBOSE does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_DEBUG does not conform to naming standard (Policy #LLP_LAYER_3)
Inferring base model configuration.
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\torch\fx\node.py:250: UserWarning: Trying to prepend a node to itself. This behavior has no effect on the graph.
  warnings.warn("Trying to prepend a node to itself. This behavior has no effect on the graph.")
Loading Winograd config file from  C:\Users\marvi\.local/shark_tank/configs/unet_winograd_vulkan.json
100%|█████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 842B/s]
100%|███████████████████████████████████████████████████████| 107/107 [00:00<00:00, 8.91kB/s]
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Traceback (most recent call last):
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\gradio\routes.py", line 374, in run_predict
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\gradio\blocks.py", line 1017, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\gradio\blocks.py", line 835, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\apps\stable_diffusion\scripts\txt2img.py", line 116, in txt2img_inf
    txt2img_obj = Text2ImagePipeline.from_pretrained(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 223, in from_pretrained
    clip, unet, vae = mlir_import()
                      ^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\apps\stable_diffusion\src\models\model_wrappers.py", line 383, in __call__
    sys.exit(
SystemExit: Cannot compile the model. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using tuned models for nitrosocke/Nitro-Diffusion/fp16/vulkan://00000000-0d00-0000-0000-000000000000.
loading existing vmfb from: C:\Users\marvi\SHARK\apps\stable_diffusion\web\euler_scale_model_input_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_VERBOSE does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_DEBUG does not conform to naming standard (Policy #LLP_LAYER_3)
loading existing vmfb from: C:\Users\marvi\SHARK\apps\stable_diffusion\web\euler_step_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_VERBOSE does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_DEBUG does not conform to naming standard (Policy #LLP_LAYER_3)
Inferring base model configuration.
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Loading Winograd config file from  C:\Users\marvi\.local/shark_tank/configs/unet_winograd_vulkan.json
100%|███████████████████████████████████████████████████████| 107/107 [00:00<00:00, 9.73kB/s]
100%|███████████████████████████████████████████████████████| 107/107 [00:00<00:00, 9.72kB/s]
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Traceback (most recent call last):
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\gradio\routes.py", line 374, in run_predict
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\gradio\blocks.py", line 1017, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\gradio\blocks.py", line 835, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\shark.venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\apps\stable_diffusion\scripts\txt2img.py", line 116, in txt2img_inf
    txt2img_obj = Text2ImagePipeline.from_pretrained(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 223, in from_pretrained
    clip, unet, vae = mlir_import()
                      ^^^^^^^^^^^^^
  File "C:\Users\marvi\SHARK\apps\stable_diffusion\src\models\model_wrappers.py", line 383, in __call__
    sys.exit(
SystemExit: Cannot compile the model. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues