nod-ai / SHARK

SHARK - High Performance Machine Learning Distribution
Apache License 2.0

Could not compile Unet. #1431

Open Xindaris opened 1 year ago

Xindaris commented 1 year ago

Windows 10, with AMD Radeon RX 6600.

Version 524 of the exe works. Version 700, and 693 (which I tried in the hope that a slightly older build would still work), both go through the exact same long-winded series of errors and retries below. They also appear to be trying to read something from GOG Galaxy, of all things, and still try even though I deleted those files. Using the `--clear_all` flag did not seem to change or fix anything.

```
shark_sd_20230419_693.exe --clear_all
shark_tank local cache is located at C:\Users\Xindaris.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
CLEARING ALL, EXPECT SEVERAL MINUTES TO RECOMPILE
vulkan devices are available.
cuda devices are not available.
diffusers\models\cross_attention.py:30: FutureWarning: Importing from cross_attention is deprecated. Please import from diffusers.models.attention_processor instead.
Running on local URL: http://0.0.0.0:8080

To create a public link, set share=True in launch().
Found device AMD Radeon RX 6600. Using target triple rdna2-unknown-windows.
Using tuned models for Linaqruf/anything-v3.0/fp16/vulkan://00000000-0300-0000-0000-000000000000.
No vmfb found. Compiling and saving to D:\Stable Diffusion\SHARK exe\euler_scale_model_input_1_512_512fp16.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in D:\Stable Diffusion\SHARK exe\euler_scale_model_input_1_512_512fp16.vmfb.
ERROR: [Loader Message] Code 0 : loader_get_json: Failed to open JSON file C:\ProgramData\GOG.com\Galaxy\redists\overlay\injected\galaxy_overlay_vklayer_x64.json
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
No vmfb found. Compiling and saving to D:\Stable Diffusion\SHARK exe\euler_step_1_512_512fp16.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in D:\Stable Diffusion\SHARK exe\euler_step_1_512_512fp16.vmfb.
ERROR: [Loader Message] Code 0 : loader_get_json: Failed to open JSON file C:\ProgramData\GOG.com\Galaxy\redists\overlay\injected\galaxy_overlay_vklayer_x64.json
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
use_tuned? sharkify: True
_1_77_512_512_fp16_tuned_anything-v3
No vmfb found. Compiling and saving to D:\Stable Diffusion\SHARK exe\clip_1_77_512_512_fp16_tuned_anything-v3_vulkan.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in D:\Stable Diffusion\SHARK exe\clip_1_77_512_512_fp16_tuned_anything-v3_vulkan.vmfb.
ERROR: [Loader Message] Code 0 : loader_get_json: Failed to open JSON file C:\ProgramData\GOG.com\Galaxy\redists\overlay\injected\galaxy_overlay_vklayer_x64.json
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
mat1 and mat2 shapes cannot be multiplied (154x1024 and 768x320)
Retrying with a different base model configuration
Loading Winograd config file from C:\Users\Xindaris.local/shark_tank/configs\unet_winograd_vulkan.json
100%|██████████| 107/107 [00:00<00:00, 2.82kB/s]
100%|██████████| 107/107 [00:00<00:00, 4.28kB/s]
Loading lowering config file from C:\Users\Xindaris.local/shark_tank/configs\unet_v1_4_fp16_vulkan_rdna2.json
100%|██████████| 24.6k/24.6k [00:00<00:00, 532kB/s]
100%|██████████| 24.6k/24.6k [00:00<00:00, 612kB/s]
Applying tuned configs on unet_1_77_512_512_fp16_tuned_anything-v3_vulkan
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[2, 9, 64, 64] to have 4 channels, but got 9 channels instead
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[2, 9, 64, 64] to have 4 channels, but got 9 channels instead
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[4, 7, 512, 512] to have 4 channels, but got 7 channels instead
Retrying with a different base model configuration
Traceback (most recent call last):
  File "gradio\routes.py", line 401, in run_predict
  File "gradio\blocks.py", line 1302, in process_api
  File "gradio\blocks.py", line 1039, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "gradio\utils.py", line 491, in async_iteration
  File "ui\txt2img_ui.py", line 173, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_txt2img.py", line 122, in generate_images
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 203, in produce_img_latents
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 103, in load_unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 640, in unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 635, in unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 59, in check_compilation
SystemExit: Could not compile Unet. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues
```
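Editor's note: the `mat1 and mat2 shapes cannot be multiplied (154x1024 and 768x320)` line is an ordinary matrix-shape mismatch: the inner dimensions (1024 vs. 768) must agree before a product is defined. 1024 and 768 are the SD2.x- and SD1.x-style text-embedding widths respectively, so a likely reading is that embeddings from one model family are being fed to a UNet configured for the other, which is what triggers the "Retrying with a different base model configuration" loop. A minimal sketch of the same check, using NumPy rather than the actual SHARK/PyTorch code:

```python
import numpy as np

# Shapes copied verbatim from the log above; any (m, k) @ (k2, n)
# product with k != k2 fails the same way.
a = np.zeros((154, 1024))  # e.g. text embeddings with hidden size 1024
b = np.zeros((768, 320))   # e.g. a projection weight expecting hidden size 768

try:
    _ = a @ b
except ValueError:
    print("shape mismatch:", a.shape, "cannot multiply", b.shape)
```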

Nughu commented 1 year ago

same issue here.

Nughu commented 1 year ago

here's my log:

```
Using F:\AI as local shark_tank cache directory.
vulkan devices are available.
cuda devices are not available.
diffusers\models\cross_attention.py:30: FutureWarning: Importing from cross_attention is deprecated. Please import from diffusers.models.attention_processor instead.
Running on local URL: http://0.0.0.0:8080

To create a public link, set share=True in launch().
Found device AMD Radeon RX 6700 XT. Using target triple rdna2-unknown-windows.
Using tuned models for stabilityai/stable-diffusion-2-1/fp16/vulkan://00000000-2f00-0000-0000-000000000000.
No vmfb found. Compiling and saving to F:\AI\SHARK\euler_scale_model_input_1_512_512_vulkan_fp16.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in F:\AI\SHARK\euler_scale_model_input_1_512_512_vulkan_fp16.vmfb.
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_VERBOSE does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_DEBUG does not conform to naming standard (Policy #LLP_LAYER_3)
ERROR: [Loader Message] Code 0 : loader_get_json: Failed to open JSON file C:\ProgramData\obs-studio-hook\obs-vulkan64.json
No vmfb found. Compiling and saving to F:\AI\SHARK\euler_step_1_512_512_vulkan_fp16.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in F:\AI\SHARK\euler_step_1_512_512_vulkan_fp16.vmfb.
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_VERBOSE does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_DEBUG does not conform to naming standard (Policy #LLP_LAYER_3)
ERROR: [Loader Message] Code 0 : loader_get_json: Failed to open JSON file C:\ProgramData\obs-studio-hook\obs-vulkan64.json
use_tuned? sharkify: True
_1_64_512_512_fp16_tuned_stable-diffusion-2-1-base
transformers\modeling_utils.py:429: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  with safe_open(checkpoint_file, framework="pt") as f:
No vmfb found. Compiling and saving to F:\AI\SHARK\clip_1_64_512_512_fp16_tuned_stable-diffusion-2-1-base_vulkan.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in F:\AI\SHARK\clip_1_64_512_512_fp16_tuned_stable-diffusion-2-1-base_vulkan.vmfb.
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_VERBOSE does not conform to naming standard (Policy #LLP_LAYER_3)
WARNING: [Loader Message] Code 0 : Layer name GalaxyOverlayVkLayer_DEBUG does not conform to naming standard (Policy #LLP_LAYER_3)
ERROR: [Loader Message] Code 0 : loader_get_json: Failed to open JSON file C:\ProgramData\obs-studio-hook\obs-vulkan64.json   <-- what, why, how?
Loading Winograd config file from F:\AI\configs\unet_winograd_vulkan.json
100%|██████████| 107/107 [00:00<00:00, 841B/s]
100%|██████████| 107/107 [00:00<00:00, 6.36kB/s]
Loading lowering config file from F:\AI\configs\unet_v2_1base_fp16_vulkan_rdna2.json
100%|██████████| 24.2k/24.2k [00:00<00:00, 189kB/s]
100%|██████████| 24.2k/24.2k [00:00<00:00, 1.08MB/s]
Applying tuned configs on unet_1_64_512_512_fp16_tuned_stable-diffusion-2-1-base_vulkan
Retrying with a different base model configuration
mat1 and mat2 shapes cannot be multiplied (128x768 and 1024x320)
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[2, 9, 64, 64] to have 4 channels, but got 9 channels instead
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[2, 9, 64, 64] to have 4 channels, but got 9 channels instead
Retrying with a different base model configuration
Given groups=1, weight of size [320, 4, 3, 3], expected input[4, 7, 512, 512] to have 4 channels, but got 7 channels instead
Retrying with a different base model configuration
Traceback (most recent call last):
  File "gradio\routes.py", line 401, in run_predict
  File "gradio\blocks.py", line 1302, in process_api
  File "gradio\blocks.py", line 1039, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "gradio\utils.py", line 491, in async_iteration
  File "ui\txt2img_ui.py", line 177, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_txt2img.py", line 122, in generate_images
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 203, in produce_img_latents
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 103, in load_unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 640, in unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 635, in unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 59, in check_compilation
SystemExit: Could not compile Unet. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues
```
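Editor's note: the `Given groups=1, weight of size [320, 4, 3, 3], expected input[...] to have 4 channels, but got 9 channels instead` messages are a Conv2d input-channel check failing; a 9-channel latent input is characteristic of inpainting-style UNet variants, so the retry loop is likely probing base-model configurations that don't match the checkpoint. A standalone sketch of that check (the helper name is invented for illustration, not SHARK code):

```python
def conv2d_channel_error(weight_shape, input_shape, groups=1):
    """Return a PyTorch-style error message if the conv input channels
    don't match the weight, else None. Shapes below come from the log."""
    out_ch, in_ch_per_group, kh, kw = weight_shape
    expected = in_ch_per_group * groups
    got = input_shape[1]  # NCHW layout: dim 1 is channels
    if got == expected:
        return None
    return (f"Given groups={groups}, weight of size {list(weight_shape)}, "
            f"expected input{list(input_shape)} to have {expected} channels, "
            f"but got {got} channels instead")

msg = conv2d_channel_error((320, 4, 3, 3), (2, 9, 64, 64))
print(msg)
```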

Nughu commented 1 year ago

```
ERROR: [Loader Message] Code 0 : loader_get_json: Failed to open JSON file C:\ProgramData\obs-studio-hook\obs-vulkan64.json
```

why is this happening and how can I fix it?

Nughu commented 1 year ago

Release 20230213_524.exe is also working for me.

RestleSSOtaKU commented 1 year ago

same here on version 20230423_700

```
shark_tank local cache is located at C:\Users\Max.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
vulkan devices are available.
cuda devices are not available.
diffusers\models\cross_attention.py:30: FutureWarning: Importing from cross_attention is deprecated. Please import from diffusers.models.attention_processor instead.
Running on local URL: http://0.0.0.0:8080

To create a public link, set share=True in launch().
Found device AMD Radeon RX 6600. Using target triple rdna2-unknown-windows.
Using tuned models for Linaqruf/anything-v3.0/fp16/vulkan://00000000-0300-0000-0000-000000000000.
Downloading (…)cheduler_config.json: 100%|██████████| 341/341 [00:00<?, ?B/s]
loading existing vmfb from: G:\SD\euler_scale_model_input_1_512_512_vulkan_fp16.vmfb
loading existing vmfb from: G:\SD\euler_step_1_512_512_vulkan_fp16.vmfb
use_tuned? sharkify: True
_1_77_512_512_fp16_tuned_anything-v3
Downloading (…)tokenizer/vocab.json: 100%|██████████| 1.06M/1.06M [00:00<00:00, 2.12MB/s]
Downloading (…)tokenizer/merges.txt: 100%|██████████| 525k/525k [00:00<00:00, 1.32MB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 472/472 [00:00<00:00, 477kB/s]
Downloading (…)okenizer_config.json: 100%|██████████| 807/807 [00:00<?, ?B/s]
Downloading (…)_encoder/config.json: 100%|██████████| 612/612 [00:00<00:00, 611kB/s]
Downloading pytorch_model.bin: 100%|██████████| 492M/492M [02:11<00:00, 3.75MB/s]
No vmfb found. Compiling and saving to G:\SD\clip_1_77_512_512_fp16_tuned_anything-v3_vulkan.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in G:\SD\clip_1_77_512_512_fp16_tuned_anything-v3_vulkan.vmfb.
Downloading (…)ain/unet/config.json: 100%|██████████| 901/901 [00:00<00:00, 899kB/s]
Downloading (…)on_pytorch_model.bin: 100%|██████████| 3.44G/3.44G [11:32<00:00, 4.97MB/s]
mat1 and mat2 shapes cannot be multiplied (154x1024 and 768x320)
Retrying with a different base model configuration
[enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 1073741824 bytes.
Retrying with a different base model configuration
Retrying with a different base model configuration
Retrying with a different base model configuration
Retrying with a different base model configuration
Traceback (most recent call last):
  File "gradio\routes.py", line 401, in run_predict
  File "gradio\blocks.py", line 1302, in process_api
  File "gradio\blocks.py", line 1039, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "gradio\utils.py", line 491, in async_iteration
  File "ui\txt2img_ui.py", line 177, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_txt2img.py", line 122, in generate_images
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 203, in produce_img_latents
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 103, in load_unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 640, in unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 635, in unet
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 59, in check_compilation
SystemExit: Could not compile Unet. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues
```
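Editor's note: the `DefaultCPUAllocator: not enough memory` failure in this log is a host-RAM allocation of exactly 1 GiB failing during one of the retries, which suggests the machine was already near its memory limit at that point:

```python
# 1073741824 bytes, taken straight from the allocator error above
nbytes = 1073741824
print(nbytes == 2**30, nbytes / 2**30, "GiB")  # → True 1.0 GiB
```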

RestleSSOtaKU commented 1 year ago

> Release 20230213_524.exe is also working for me.

I have an issue with 20230213_524.exe as well ^^

```
shark_tank local cache is located at C:\Users\Max.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
vulkan devices are available.
cuda devices are not available.
Running on local URL: http://0.0.0.0:8080

To create a public link, set share=True in launch().
Found device AMD Radeon RX 6600. Using target triple rdna2-unknown-windows.
Tuned models are currently not supported for this setting.
Downloading (…)cheduler_config.json: 100%|██████████| 341/341 [00:00<00:00, 341kB/s]
Downloading artifacts for model euler_scale_model_input_fp16...
100%|██████████| 1.08k/1.08k [00:00<00:00, 3.06kB/s]
100%|██████████| 156/156 [00:00<00:00, 275B/s]
100%|██████████| 32.3k/32.3k [00:00<00:00, 42.7kB/s]
100%|██████████| 640/640 [00:00<00:00, 1.26kB/s]
100%|██████████| 32.5k/32.5k [00:00<00:00, 48.2kB/s]
No vmfb found. Compiling and saving to G:\SD2\euler_scale_model_input_fp16.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in G:\SD2\euler_scale_model_input_fp16.vmfb.
Downloading artifacts for model euler_step_fp16...
100%|██████████| 1.09k/1.09k [00:00<00:00, 2.51kB/s]
100%|██████████| 156/156 [00:00<00:00, 244B/s]
100%|██████████| 32.3k/32.3k [00:00<00:00, 47.1kB/s]
100%|██████████| 640/640 [00:00<00:00, 1.11kB/s]
100%|██████████| 65.0k/65.0k [00:00<00:00, 85.0kB/s]
No vmfb found. Compiling and saving to G:\SD2\euler_step_fp16.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in G:\SD2\euler_step_fp16.vmfb.
Downloading artifacts for model av3_vae_19dec_fp16...
100%|██████████| 189M/189M [00:50<00:00, 3.96MB/s]
100%|██████████| 156/156 [00:00<00:00, 280B/s]
100%|██████████| 1.50M/1.50M [00:01<00:00, 1.47MB/s]
100%|██████████| 640/640 [00:00<00:00, 1.21kB/s]
100%|██████████| 32.3k/32.3k [00:00<00:00, 51.1kB/s]
No vmfb found. Compiling and saving to G:\SD2\av3_vae_19dec_fp16.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in G:\SD2\av3_vae_19dec_fp16.vmfb.
Downloading artifacts for model av3_clip_19dec_fp32...
100%|██████████| 939M/939M [03:59<00:00, 4.11MB/s]
100%|██████████| 156/156 [00:00<00:00, 388B/s]
100%|██████████| 462k/462k [00:00<00:00, 865kB/s]
100%|██████████| 640/640 [00:00<00:00, 1.26kB/s]
100%|██████████| 1.46k/1.46k [00:00<00:00, 3.13kB/s]
No vmfb found. Compiling and saving to G:\SD2\av3_clip_19dec_fp32.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Saved vmfb in G:\SD2\av3_clip_19dec_fp32.vmfb.
Downloading artifacts for model av3_unet_19dec_fp16...
100%|██████████| 3.20G/3.20G [13:10<00:00, 4.35MB/s]
100%|██████████| 156/156 [00:00<00:00, 246B/s]
100%|██████████| 32.3k/32.3k [00:00<00:00, 51.2kB/s]
100%|██████████| 640/640 [00:00<00:00, 1.52kB/s]
100%|██████████| 264k/264k [00:00<00:00, 481kB/s]
No vmfb found. Compiling and saving to G:\SD2\av3_unet_19dec_fp16.vmfb
Using target triple -iree-vulkan-target-triple=rdna2-unknown-windows from command line args
Traceback (most recent call last):
  File "gradio\routes.py", line 374, in run_predict
  File "gradio\blocks.py", line 1017, in process_api
  File "gradio\blocks.py", line 835, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "apps\stable_diffusion\scripts\txt2img.py", line 214, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 233, in from_pretrained
  File "apps\stable_diffusion\src\models\opt_params.py", line 54, in get_unet
  File "apps\stable_diffusion\src\utils\utils.py", line 82, in get_shark_model
  File "apps\stable_diffusion\src\utils\utils.py", line 53, in _compile_module
  File "shark\shark_inference.py", line 188, in save_module
  File "shark\iree_utils\compile_utils.py", line 342, in export_iree_module_to_vmfb
  File "shark\iree_utils\compile_utils.py", line 279, in compile_module_to_flatbuffer
  File "iree\compiler\tools\core.py", line 280, in compile_str
  File "iree\compiler\tools\binaries.py", line 198, in invoke_immediate
iree.compiler.tools.binaries.CompilerToolError: Error invoking IREE compiler tool iree-compile.exe
Diagnostics:
LLVM ERROR: out of memory
Allocation failed
Please report issues to https://github.com/iree-org/iree/issues and include the crash backtrace.
Stack dump:
 1. Program arguments: C:\windows\temp\_MEI68842\iree\compiler\tools\..\_mlir_libs\iree-compile.exe - --iree-input-type=none --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=vulkan --iree-llvm-embedded-linker-path=C:\windows\temp\_MEI68842\iree\compiler\tools\..\_mlir_libs\iree-lld.exe --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvm-target-cpu-features=host "--iree-vulkan-target-env=#vk.target_env<v1.3, r(120), [VK_KHR_16bit_storage, VK_KHR_8bit_storage, VK_KHR_shader_float16_int8, VK_KHR_spirv_1_4, VK_KHR_storage_buffer_storage_class, VK_KHR_variable_pointers, VK_EXT_subgroup_size_control], AMD:DiscreteGPU, #vk.caps< maxComputeSharedMemorySize = 65536, maxComputeWorkGroupInvocations = 1024, maxComputeWorkGroupSize = dense<[1024, 1024, 1024]>: vector<3xi32>, subgroupSize = 64, subgroupFeatures = 255: i32, minSubgroupSize = 32, maxSubgroupSize = 64, shaderFloat16 = unit, shaderFloat64 = unit, shaderInt8 = unit, shaderInt16 = unit, shaderInt64 = unit, storageBuffer16BitAccess = unit, storagePushConstant16 = unit, uniformAndStorageBuffer16BitAccess = unit, storageBuffer8BitAccess = unit, storagePushConstant8 = unit, uniformAndStorageBuffer8BitAccess = unit, variablePointers = unit, variablePointersStorageBuffer = unit >>" --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs -iree-vulkan-target-triple=rdna2-unknown-windows --iree-preprocessing-pass-pipeline=builtin.module(func.func(iree-flow-detach-elementwise-from-named-ops,iree-flow-convert-1x1-filter-conv2d-to-matmul,iree-preprocessing-convert-conv2d-to-img2col,iree-preprocessing-pad-linalg-ops{pad-size=32}))
Exception Code: 0x80000003
 0x00007FFCC1248165, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\IREECompiler.dll(0x00007FFCBD6A0000) + 0x3BA8165 byte(s), mlirTypeIDHashValue() + 0x37F0EB5 byte(s)
 0x00007FFCE9611881, C:\windows\System32\ucrtbase.dll(0x00007FFCE95A0000) + 0x71881 byte(s), raise() + 0x1E1 byte(s)
 0x00007FFCE9612851, C:\windows\System32\ucrtbase.dll(0x00007FFCE95A0000) + 0x72851 byte(s), abort() + 0x31 byte(s)
 0x00007FFCC11BC7E5, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\IREECompiler.dll(0x00007FFCBD6A0000) + 0x3B1C7E5 byte(s), mlirTypeIDHashValue() + 0x3765535 byte(s)
 0x00007FFCC11A7A50, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\IREECompiler.dll(0x00007FFCBD6A0000) + 0x3B07A50 byte(s), mlirTypeIDHashValue() + 0x37507A0 byte(s)
 0x00007FFCC11D7778, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\IREECompiler.dll(0x00007FFCBD6A0000) + 0x3B37778 byte(s), mlirTypeIDHashValue() + 0x37804C8 byte(s)
 0x00007FFCC11BD868, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\IREECompiler.dll(0x00007FFCBD6A0000) + 0x3B1D868 byte(s), mlirTypeIDHashValue() + 0x37665B8 byte(s)
 0x00007FFCC11BD52D, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\IREECompiler.dll(0x00007FFCBD6A0000) + 0x3B1D52D byte(s), mlirTypeIDHashValue() + 0x376627D byte(s)
 0x00007FFCC11A5336, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\IREECompiler.dll(0x00007FFCBD6A0000) + 0x3B05336 byte(s), mlirTypeIDHashValue() + 0x374E086 byte(s)
 0x00007FFCC11A52D6, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\IREECompiler.dll(0x00007FFCBD6A0000) + 0x3B052D6 byte(s), mlirTypeIDHashValue() + 0x374E026 byte(s)
 0x00007FFCBD73C9C6, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\IREECompiler.dll(0x00007FFCBD6A0000) + 0x9C9C6 byte(s), ireeCompilerSourceOpenFile() + 0xB6 byte(s)
 0x00007FFCBD72DA39, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\IREECompiler.dll(0x00007FFCBD6A0000) + 0x8DA39 byte(s), mlirBlockArgumentGetArgNumber() + 0x1D89 byte(s)
 0x00007FF64C8311FC, C:\windows\temp\_MEI68842\iree\compiler\_mlir_libs\iree-compile.exe(0x00007FF64C830000) + 0x11FC byte(s)
 0x00007FFCEB427034, C:\windows\System32\KERNEL32.DLL(0x00007FFCEB410000) + 0x17034 byte(s), BaseThreadInitThunk() + 0x14 byte(s)
 0x00007FFCEBD42651, C:\windows\SYSTEM32\ntdll.dll(0x00007FFCEBCF0000) + 0x52651 byte(s), RtlUserThreadStart() + 0x21 byte(s)

Invoked with:
 iree-compile.exe C:\windows\temp\_MEI68842\iree\compiler\tools\..\_mlir_libs\iree-compile.exe - --iree-input-type=none --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=vulkan --iree-llvm-embedded-linker-path=C:\windows\temp\_MEI68842\iree\compiler\tools\..\_mlir_libs\iree-lld.exe --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvm-target-cpu-features=host --iree-vulkan-target-env=#vk.target_env<v1.3, r(120), [VK_KHR_16bit_storage, VK_KHR_8bit_storage, VK_KHR_shader_float16_int8, VK_KHR_spirv_1_4, VK_KHR_storage_buffer_storage_class, VK_KHR_variable_pointers, VK_EXT_subgroup_size_control], AMD:DiscreteGPU, #vk.caps< maxComputeSharedMemorySize = 65536, maxComputeWorkGroupInvocations = 1024, maxComputeWorkGroupSize = dense<[1024, 1024, 1024]>: vector<3xi32>, subgroupSize = 64, subgroupFeatures = 255: i32, minSubgroupSize = 32, maxSubgroupSize = 64, shaderFloat16 = unit, shaderFloat64 = unit, shaderInt8 = unit, shaderInt16 = unit, shaderInt64 = unit, storageBuffer16BitAccess = unit, storagePushConstant16 = unit, uniformAndStorageBuffer16BitAccess = unit, storageBuffer8BitAccess = unit, storagePushConstant8 = unit, uniformAndStorageBuffer8BitAccess = unit, variablePointers = unit, variablePointersStorageBuffer = unit >> --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs -iree-vulkan-target-triple=rdna2-unknown-windows --iree-preprocessing-pass-pipeline=builtin.module(func.func(iree-flow-detach-elementwise-from-named-ops,iree-flow-convert-1x1-filter-conv2d-to-matmul,iree-preprocessing-convert-conv2d-to-img2col,iree-preprocessing-pad-linalg-ops{pad-size=32}))

Need more information? Set IREE_SAVE_TEMPS=/some/dir in your environment to save all artifacts and reproducers.
```
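Editor's note: the compiler's closing suggestion can be followed like this before re-running SHARK (the directory path is an arbitrary example; on Windows cmd use `set IREE_SAVE_TEMPS=C:\some\dir` instead of `export`):

```shell
# Ask iree-compile to save all intermediate artifacts and reproducers
# into a directory of your choosing, for attaching to a bug report.
export IREE_SAVE_TEMPS=/tmp/iree-temps
mkdir -p "$IREE_SAVE_TEMPS"
echo "IREE_SAVE_TEMPS=$IREE_SAVE_TEMPS"
```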

ai-superuser commented 1 year ago

Restart your SHARK without the `--clear_all` flag; that helped me.

Xindaris commented 1 year ago

When I say the `--clear_all` flag did not seem to change or fix anything, I mean that I tried without it first, then with it, and got exactly the same results.

RestleSSOtaKU commented 1 year ago

https://github.com/lshqqytiger/stable-diffusion-webui-directml

This helped me; it works fine on my RX 6600 with these `webui-user.bat` parameters: `set COMMANDLINE_ARGS=--opt-sub-quad-attention --disable-nan-check --precision autocast --autolaunch`
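Editor's note: for context, those arguments normally live in the `webui-user.bat` launcher that ships with that repo. A minimal sketch, assuming the standard layout of that project (not verified against the linked repository):

```bat
@echo off
rem Hypothetical minimal webui-user.bat; the COMMANDLINE_ARGS line
rem is copied verbatim from the comment above.
set COMMANDLINE_ARGS=--opt-sub-quad-attention --disable-nan-check --precision autocast --autolaunch
call webui.bat
```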

Installing the usual safetensors models, LoRAs, VAEs, and any extra extensions all works fine. I'm so relieved ^_^