Hi @MrManiacal
In your command for using HuggingFace's ckpt, please include the ckpt file itself as part of the path. Effectively it'd look like --ckpt_loc="D:/nod/anything-v4.0/name_of_checkpoint.ckpt".
In order to use HuggingFace's repo-id, I see you haven't used --hf_model_id; there is no difference between your earlier CKPT command and the one you've used to run a HuggingFace repo-id. Remove the --ckpt_loc flag and please use --hf_model_id="andite/anything-v4.0" (assuming you want to run andite/anything-v4.0).
NOTE: You might also want to include the --no-use_tuned flag along with both commands if you're running on Windows. :)
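For example (prompt and paths are illustrative - adjust them to your setup), the two invocations would look roughly like:
python main.py --precision=fp16 --device=vulkan --no-use_tuned --max_length=64 --prompt="your prompt here" --ckpt_loc="D:/nod/anything-v4.0/name_of_checkpoint.ckpt"
python main.py --precision=fp16 --device=vulkan --no-use_tuned --max_length=64 --prompt="your prompt here" --hf_model_id="andite/anything-v4.0"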
CC: @powderluv
I accidentally copied the same command twice. For --hf_model_id, I get the following:
(shark.venv) PS D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion> python main.py --hf_model_id="andite/anything-v4.0" --max_length=77 --prompt="1girl, brown hair, green eyes, colorful, autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden"
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['1girl, brown hair, green eyes, colorful, autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using vulkan://00000000-0900-0000-0000-000000000000 tuned models for stablediffusion/fp16.
[WARNING] Only 13.68GB space available in D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion.
Traceback (most recent call last):
File "D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\main.py", line 111, in
I ran the full command with the ckpt file and got this:
(shark.venv) PS D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion> python main.py --precision=fp16 --device=vulkan --prompt="tajmahal, oil on canvas, sunflowers, 4k, uhd" --max_length=64 --import_mlir --ckpt_loc="D:/nod/anything-v4.0/anything-v4.0-pruned-fp32.ckpt"
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['tajmahal, oil on canvas, sunflowers, 4k, uhd']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using vulkan://00000000-0900-0000-0000-000000000000 tuned models for stablediffusion/fp16.
Created directory : anything-v4.0-pruned-fp32 at -> D:\nod\anything-v4.0
Downloaded SD to Diffusers converter
Traceback (most recent call last):
File "D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\sd_to_diffusers.py", line 23, in enable_stack_trace
and create an issue at https://github.com/nod-ai/SHARK/issues
OK, saw the safetensors error. I had run the pip command to install safetensors previously, but did it again. Here is the latest error:
(shark.venv) PS D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion> python main.py --precision=fp16 --device=vulkan --prompt="tajmahal, oil on canvas, sunflowers, 4k, uhd" --max_length=64 --import_mlir --ckpt_loc="D:/nod/anything-v4.0/anything-v4.0-pruned-fp32.ckpt"
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['tajmahal, oil on canvas, sunflowers, 4k, uhd']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using vulkan://00000000-0900-0000-0000-000000000000 tuned models for stablediffusion/fp16.
Created directory : anything-v4.0-pruned-fp32 at -> D:\nod\anything-v4.0
SD to Diffusers converter already exists
'wget' is not recognized as an internal or external command,
operable program or batch file.
Traceback (most recent call last):
File "D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\sd_to_diffusers.py", line 912, in enable_stack_trace
and create an issue at https://github.com/nod-ai/SHARK/issues
Regarding --hf_model_id, please use the --import_mlir flag as well. Refer to a small example here.
Regarding the CKPT issue, it seems like a problem with wget on Windows - we'll take a look at that. Thanks for reporting!
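In the meantime, a possible workaround (an untested sketch - the URL and destination below are assumptions about what sd_to_diffusers.py actually fetches, not taken from the sources) is to download the inference yaml yourself instead of relying on wget:

# Hypothetical stand-in for the failing wget step: fetch the SD v1 inference
# yaml with urllib and drop it where the converter expects it.
import urllib.request
from pathlib import Path

url = (
    "https://raw.githubusercontent.com/CompVis/stable-diffusion/"
    "main/configs/stable-diffusion/v1-inference.yaml"
)
dest = Path.home() / "v1-inference.yaml"  # destination path is illustrative
urllib.request.urlretrieve(url, dest)
print(f"Saved {dest}")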
Used --import_mlir, still get "Cannot compile the model":
(shark.venv) PS D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion> python main.py --import_mlir --hf_model_id="andite/anything-v4.0" --max_length=77 --prompt="1girl, brown hair, green eyes, colorful, autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden"
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['1girl, brown hair, green eyes, colorful, autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using vulkan://00000000-0900-0000-0000-000000000000 tuned models for stablediffusion/fp16.
Downloading (…)_encoder/config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 617/617 [00:00<00:00, 617kB/s]
Downloading (…)"pytorch_model.bin";: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 492M/492M [00:07<00:00, 68.8MB/s]
D:\nod\shark-main\shark.venv\lib\site-packages\torch\jit_check.py:181: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in __init__. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in torch.jit.Attribute.
warnings.warn("The TorchScript type system doesn't support "
No vmfb found. Compiling and saving to D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\clip1_77_512_512_fp16_andite_anything_v4_0_vulkan-00000000-0900-0000-0000-000000000000.vmfb
Using target triple -iree-vulkan-target-triple=rdna3-7900-windows from command line args
Saved vmfb in D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\clip1_77_512_512_fp16_andite_anything_v4_0_vulkan-00000000-0900-0000-0000-000000000000.vmfb.
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Downloading (…)_pytorch_model.bin";: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.44G/3.44G [00:56<00:00, 61.3MB/s]
Downloading (…)ain/unet/config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.02k/1.02k [00:00<00:00, 518kB/s]
Retrying with a different base model configuration
loading existing vmfb from: D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\clip1_77_512_512_fp16_andite_anything_v4_0_vulkan-00000000-0900-0000-0000-000000000000.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Retrying with a different base model configuration
Cannot compile the model. Please use enable_stack_trace and create an issue at https://github.com/nod-ai/SHARK/issues
I'll try doing a fresh install and trying again
No luck fresh install:
(shark.venv) PS D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion> python main.py --precision=fp16 --device=vulkan --prompt="tajmahal, oil on canvas, sunflowers, 4k, uhd" --max_length=64 --import_mlir --ckpt_loc="D:/nod/anything-v4.0/anything-v4.0-pruned-fp32.ckpt"
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['tajmahal, oil on canvas, sunflowers, 4k, uhd']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using vulkan://00000000-0900-0000-0000-000000000000 tuned models for stablediffusion/fp16.
Created directory : anything-v4.0-pruned-fp32 at -> D:\nod\anything-v4.0
SD to Diffusers converter already exists
'wget' is not recognized as an internal or external command,
operable program or batch file.
Traceback (most recent call last):
File "D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\sd_to_diffusers.py", line 912, in enable_stack_trace
and create an issue at https://github.com/nod-ai/SHARK/issues
(shark.venv) PS D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion> python main.py --import_mlir --hf_model_id="andite/anything-v4.0" --max_length=77 --prompt="1girl, brown hair, green eyes, colorful, autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden"
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['1girl, brown hair, green eyes, colorful, autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using vulkan://00000000-0900-0000-0000-000000000000 tuned models for stablediffusion/fp16.
D:\Nod\SHARK-main\shark.venv\lib\site-packages\torch\jit_check.py:181: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in __init__. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in torch.jit.Attribute.
warnings.warn("The TorchScript type system doesn't support "
No vmfb found. Compiling and saving to D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\clip1_77_512_512_fp16_andite_anything_v4_0_vulkan-00000000-0900-0000-0000-000000000000.vmfb
Using target triple -iree-vulkan-target-triple=rdna3-7900-windows from command line args
Saved vmfb in D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\clip1_77_512_512_fp16_andite_anything_v4_0_vulkan-00000000-0900-0000-0000-000000000000.vmfb.
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Cannot initialize model with low cpu memory usage because accelerate was not found in the environment. Defaulting to low_cpu_mem_usage=False. It is strongly recommended to install accelerate for faster and less memory-intense model loading. You can do so with:
pip install accelerate
Retrying with a different base model configuration
loading existing vmfb from: D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\clip1_77_512_512_fp16_andite_anything_v4_0_vulkan-00000000-0900-0000-0000-000000000000.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Cannot initialize model with low cpu memory usage because accelerate was not found in the environment. Defaulting to low_cpu_mem_usage=False. It is strongly recommended to install accelerate for faster and less memory-intense model loading. You can do so with:
pip install accelerate
D:\Nod\SHARK-main\shark.venv\lib\site-packages\torch\fx\node.py:244: UserWarning: Trying to prepend a node to itself. This behavior has no effect on the graph.
warnings.warn("Trying to prepend a node to itself. This behavior has no effect on the graph.")
Loading Winograd config file from C:\Users\M.local/shark_tank/configs/unet_winograd_vulkan.json
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 823B/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 2.97kB/s]
Retrying with a different base model configuration
Cannot compile the model. Please use enable_stack_trace and create an issue at https://github.com/nod-ai/SHARK/issues
So it looks like it's checking for inference.yaml files in the base directory of the user profile, i.e. c:\users\CurrentUser. Once I grabbed the yaml files and stuck them there, I made more progress, but the error was the same:
(shark.venv) PS C:\Users\M> python D:\nod\shark-main\shark\examples\shark_inference\stable_diffusion\main.py --precision=fp16 --device=vulkan --prompt="tajmahal, oil on canvas, sunflowers, 4k, uhd" --max_length=64 --import_mlir --ckpt_loc="E:/nod/epicDiffusion_11.ckpt"
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['tajmahal, oil on canvas, sunflowers, 4k, uhd']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using vulkan://00000000-0900-0000-0000-000000000000 tuned models for stablediffusion/fp16.
Created directory : epicDiffusion_11 at -> E:\nod
SD to Diffusers converter already exists
global_step key not found in model
Downloading (…)lve/main/config.json: 100%|████████████████████████████████████████| 4.52k/4.52k [00:00<00:00, 4.52MB/s]
D:\nod\shark-main\shark.venv\lib\site-packages\huggingface_hub\file_download.py:129: UserWarning: huggingface_hub
cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\M.cache\huggingface\hub. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the HF_HUB_DISABLE_SYMLINKS_WARNING
environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Downloading (…)"pytorch_model.bin";: 100%|████████████████████████████████████████| 1.71G/1.71G [00:17<00:00, 97.2MB/s]
Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: ['vision_model.encoder.layers.16.self_attn.k_proj.weight', 'vision_model.encoder.layers.21.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.self_attn.v_proj.bias', 'vision_model.encoder.layers.10.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.self_attn.q_proj.weight', 'vision_model.encoder.layers.4.mlp.fc1.bias', 'vision_model.encoder.layers.6.mlp.fc2.weight', 'vision_model.encoder.layers.15.layer_norm1.weight', 'vision_model.encoder.layers.7.layer_norm1.weight', 'vision_model.encoder.layers.10.self_attn.v_proj.weight', 'vision_model.encoder.layers.22.self_attn.out_proj.bias', 'vision_model.encoder.layers.4.self_attn.v_proj.bias', 'vision_model.encoder.layers.8.self_attn.out_proj.bias', 'vision_model.encoder.layers.20.self_attn.out_proj.weight', 'vision_model.encoder.layers.7.layer_norm1.bias', 'vision_model.encoder.layers.22.mlp.fc2.weight', 'vision_model.encoder.layers.1.layer_norm1.weight', 'vision_model.encoder.layers.23.self_attn.out_proj.bias', 'vision_model.encoder.layers.17.mlp.fc2.bias', 'vision_model.encoder.layers.10.self_attn.q_proj.bias', 'vision_model.encoder.layers.15.self_attn.out_proj.weight', 'vision_model.encoder.layers.14.self_attn.out_proj.bias', 'vision_model.encoder.layers.15.self_attn.k_proj.bias', 'vision_model.encoder.layers.11.mlp.fc2.bias', 'vision_model.encoder.layers.3.layer_norm2.bias', 'vision_model.encoder.layers.20.self_attn.k_proj.bias', 'vision_model.encoder.layers.13.self_attn.k_proj.bias', 'vision_model.encoder.layers.2.self_attn.k_proj.weight', 'vision_model.encoder.layers.2.layer_norm1.weight', 'vision_model.encoder.layers.0.mlp.fc1.bias', 'vision_model.encoder.layers.7.self_attn.k_proj.bias', 'vision_model.encoder.layers.6.mlp.fc1.bias', 'vision_model.encoder.layers.22.self_attn.q_proj.weight', 'vision_model.encoder.layers.9.layer_norm1.weight', 'vision_model.encoder.layers.3.self_attn.q_proj.bias', 'vision_model.encoder.layers.3.mlp.fc2.weight', 'vision_model.encoder.layers.14.self_attn.q_proj.bias', 'vision_model.encoder.layers.4.mlp.fc2.bias', 'vision_model.post_layernorm.bias', 'vision_model.encoder.layers.13.self_attn.k_proj.weight', 'vision_model.encoder.layers.11.self_attn.k_proj.weight', 'vision_model.encoder.layers.15.mlp.fc2.bias', 'vision_model.encoder.layers.14.self_attn.k_proj.weight', 'vision_model.encoder.layers.5.layer_norm2.weight', 'vision_model.encoder.layers.3.self_attn.q_proj.weight', 'vision_model.encoder.layers.18.mlp.fc2.bias', 'vision_model.encoder.layers.14.self_attn.v_proj.weight', 'vision_model.encoder.layers.23.self_attn.q_proj.weight', 'vision_model.encoder.layers.5.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.mlp.fc2.weight', 'vision_model.encoder.layers.23.mlp.fc1.weight', 'vision_model.encoder.layers.5.self_attn.out_proj.bias', 'vision_model.encoder.layers.2.self_attn.out_proj.bias', 'vision_model.encoder.layers.9.self_attn.q_proj.weight', 'vision_model.encoder.layers.5.mlp.fc1.bias', 'vision_model.encoder.layers.10.self_attn.out_proj.weight', 'vision_model.encoder.layers.12.self_attn.v_proj.bias', 'vision_model.encoder.layers.14.layer_norm1.weight', 'vision_model.encoder.layers.1.self_attn.q_proj.bias', 'vision_model.encoder.layers.3.mlp.fc1.bias', 'vision_model.encoder.layers.22.mlp.fc1.weight', 'vision_model.encoder.layers.12.mlp.fc2.weight', 'vision_model.encoder.layers.21.mlp.fc2.weight', 'vision_model.encoder.layers.18.self_attn.q_proj.bias', 
'vision_model.encoder.layers.16.self_attn.k_proj.bias', 'vision_model.encoder.layers.3.self_attn.v_proj.bias', 'vision_model.encoder.layers.14.layer_norm2.weight', 'vision_model.encoder.layers.14.self_attn.k_proj.bias', 'vision_model.encoder.layers.20.layer_norm1.bias', 'vision_model.encoder.layers.3.self_attn.k_proj.weight', 'vision_model.encoder.layers.4.self_attn.q_proj.weight', 'vision_model.encoder.layers.1.self_attn.v_proj.weight', 'vision_model.encoder.layers.8.mlp.fc1.weight', 'vision_model.encoder.layers.5.self_attn.q_proj.weight', 'vision_model.encoder.layers.22.self_attn.k_proj.bias', 'vision_model.encoder.layers.16.self_attn.out_proj.bias', 'vision_model.encoder.layers.6.self_attn.out_proj.bias', 'vision_model.encoder.layers.7.layer_norm2.bias', 'vision_model.encoder.layers.23.layer_norm1.bias', 'vision_model.encoder.layers.4.self_attn.k_proj.bias', 'vision_model.encoder.layers.5.self_attn.v_proj.bias', 'vision_model.encoder.layers.19.self_attn.v_proj.weight', 'vision_model.encoder.layers.14.mlp.fc1.bias', 'vision_model.encoder.layers.1.self_attn.k_proj.bias', 'vision_model.encoder.layers.14.layer_norm1.bias', 'vision_model.encoder.layers.12.mlp.fc2.bias', 'vision_model.encoder.layers.21.mlp.fc1.weight', 'vision_model.encoder.layers.5.mlp.fc2.bias', 'vision_model.encoder.layers.22.mlp.fc1.bias', 'vision_model.encoder.layers.14.layer_norm2.bias', 'vision_model.encoder.layers.21.layer_norm1.bias', 'vision_model.encoder.layers.13.mlp.fc1.bias', 'vision_model.encoder.layers.8.layer_norm1.weight', 'vision_model.encoder.layers.11.layer_norm2.bias', 'vision_model.encoder.layers.12.self_attn.k_proj.bias', 'vision_model.encoder.layers.6.self_attn.q_proj.bias', 'vision_model.encoder.layers.19.mlp.fc2.weight', 'vision_model.encoder.layers.8.mlp.fc2.bias', 'vision_model.encoder.layers.19.layer_norm2.bias', 'vision_model.encoder.layers.22.layer_norm1.weight', 'vision_model.encoder.layers.11.self_attn.v_proj.weight', 'vision_model.encoder.layers.17.self_attn.out_proj.weight', 'vision_model.encoder.layers.14.self_attn.q_proj.weight', 'vision_model.post_layernorm.weight', 'vision_model.encoder.layers.18.layer_norm1.weight', 'vision_model.encoder.layers.6.mlp.fc2.bias', 'vision_model.encoder.layers.13.self_attn.q_proj.bias', 'vision_model.encoder.layers.6.layer_norm2.weight', 'vision_model.encoder.layers.9.mlp.fc1.weight', 'vision_model.encoder.layers.22.self_attn.out_proj.weight', 'vision_model.encoder.layers.18.layer_norm2.bias', 'vision_model.encoder.layers.0.layer_norm2.weight', 'vision_model.encoder.layers.8.mlp.fc1.bias', 'vision_model.encoder.layers.8.layer_norm2.weight', 'vision_model.encoder.layers.13.self_attn.v_proj.bias', 'vision_model.encoder.layers.20.layer_norm2.bias', 'vision_model.encoder.layers.17.self_attn.v_proj.bias', 'vision_model.encoder.layers.3.layer_norm1.bias', 'vision_model.embeddings.position_ids', 'vision_model.encoder.layers.4.mlp.fc1.weight', 'vision_model.encoder.layers.19.self_attn.out_proj.bias', 'vision_model.encoder.layers.5.mlp.fc1.weight', 'vision_model.encoder.layers.9.mlp.fc2.weight', 'vision_model.encoder.layers.7.layer_norm2.weight', 'vision_model.encoder.layers.16.mlp.fc1.bias', 'vision_model.encoder.layers.16.layer_norm1.bias', 'vision_model.encoder.layers.17.mlp.fc1.weight', 'vision_model.encoder.layers.11.layer_norm2.weight', 'vision_model.encoder.layers.6.layer_norm1.bias', 'vision_model.encoder.layers.12.self_attn.q_proj.weight', 'vision_model.encoder.layers.19.mlp.fc1.bias', 'vision_model.encoder.layers.9.layer_norm2.weight', 
'vision_model.encoder.layers.3.self_attn.out_proj.weight', 'vision_model.encoder.layers.21.layer_norm1.weight', 'vision_model.encoder.layers.8.self_attn.k_proj.bias', 'vision_model.encoder.layers.22.mlp.fc2.bias', 'vision_model.encoder.layers.15.self_attn.k_proj.weight', 'vision_model.encoder.layers.1.mlp.fc2.bias', 'vision_model.encoder.layers.9.self_attn.k_proj.weight', 'vision_model.encoder.layers.16.mlp.fc1.weight', 'vision_model.encoder.layers.20.self_attn.v_proj.bias', 'vision_model.encoder.layers.16.layer_norm1.weight', 'vision_model.encoder.layers.13.layer_norm2.weight', 'vision_model.encoder.layers.13.self_attn.q_proj.weight', 'vision_model.encoder.layers.7.self_attn.v_proj.bias', 'vision_model.encoder.layers.2.self_attn.v_proj.bias', 'vision_model.encoder.layers.14.mlp.fc1.weight', 'vision_model.encoder.layers.0.self_attn.k_proj.bias', 'vision_model.encoder.layers.7.self_attn.out_proj.bias', 'vision_model.encoder.layers.23.self_attn.k_proj.bias', 'vision_model.encoder.layers.11.mlp.fc1.bias', 'vision_model.encoder.layers.11.self_attn.out_proj.bias', 'vision_model.embeddings.class_embedding', 'vision_model.encoder.layers.5.mlp.fc2.weight', 'visual_projection.weight', 'vision_model.encoder.layers.18.self_attn.v_proj.weight', 'vision_model.encoder.layers.3.self_attn.k_proj.bias', 'vision_model.encoder.layers.7.mlp.fc1.bias', 'vision_model.encoder.layers.12.mlp.fc1.weight', 'vision_model.encoder.layers.7.self_attn.q_proj.weight', 'vision_model.encoder.layers.2.self_attn.q_proj.bias', 'vision_model.encoder.layers.0.self_attn.out_proj.weight', 'vision_model.embeddings.patch_embedding.weight', 'vision_model.encoder.layers.22.layer_norm2.weight', 'vision_model.encoder.layers.13.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.self_attn.q_proj.weight', 'vision_model.encoder.layers.10.layer_norm2.weight', 'vision_model.encoder.layers.2.self_attn.q_proj.weight', 'vision_model.encoder.layers.2.layer_norm2.bias', 'vision_model.encoder.layers.0.layer_norm1.weight', 'vision_model.encoder.layers.4.layer_norm1.bias', 'vision_model.encoder.layers.8.self_attn.q_proj.weight', 'vision_model.embeddings.position_embedding.weight', 'vision_model.encoder.layers.17.layer_norm1.bias', 'vision_model.encoder.layers.10.mlp.fc1.weight', 'vision_model.encoder.layers.13.layer_norm2.bias', 'vision_model.encoder.layers.5.layer_norm1.weight', 'vision_model.encoder.layers.7.self_attn.k_proj.weight', 'vision_model.encoder.layers.11.self_attn.out_proj.weight', 'vision_model.encoder.layers.1.self_attn.k_proj.weight', 'vision_model.encoder.layers.19.self_attn.k_proj.bias', 'vision_model.encoder.layers.20.self_attn.k_proj.weight', 'vision_model.encoder.layers.2.mlp.fc1.bias', 'vision_model.encoder.layers.17.layer_norm2.weight', 'vision_model.encoder.layers.19.self_attn.k_proj.weight', 'vision_model.encoder.layers.20.mlp.fc1.weight', 'vision_model.encoder.layers.4.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.self_attn.out_proj.weight', 'vision_model.encoder.layers.13.self_attn.v_proj.weight', 'vision_model.encoder.layers.9.mlp.fc2.bias', 'vision_model.pre_layrnorm.bias', 'vision_model.encoder.layers.5.self_attn.out_proj.weight', 'vision_model.encoder.layers.7.mlp.fc2.bias', 'vision_model.encoder.layers.3.layer_norm2.weight', 'vision_model.encoder.layers.20.mlp.fc2.bias', 'vision_model.encoder.layers.21.self_attn.k_proj.bias', 'vision_model.encoder.layers.23.self_attn.q_proj.bias', 'vision_model.encoder.layers.9.self_attn.k_proj.bias', 'vision_model.encoder.layers.14.mlp.fc2.bias', 
'vision_model.encoder.layers.18.layer_norm2.weight', 'vision_model.encoder.layers.12.layer_norm1.weight', 'vision_model.encoder.layers.16.layer_norm2.weight', 'vision_model.encoder.layers.6.self_attn.k_proj.bias', 'vision_model.encoder.layers.19.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.mlp.fc2.weight', 'vision_model.encoder.layers.19.layer_norm1.weight', 'vision_model.encoder.layers.23.mlp.fc1.bias', 'vision_model.encoder.layers.0.mlp.fc2.weight', 'vision_model.encoder.layers.18.self_attn.q_proj.weight', 'vision_model.encoder.layers.21.mlp.fc2.bias', 'vision_model.encoder.layers.13.mlp.fc2.bias', 'vision_model.encoder.layers.14.self_attn.out_proj.weight', 'vision_model.encoder.layers.19.self_attn.q_proj.weight', 'vision_model.encoder.layers.17.mlp.fc1.bias', 'vision_model.encoder.layers.0.mlp.fc2.bias', 'vision_model.encoder.layers.5.self_attn.k_proj.weight', 'vision_model.encoder.layers.0.self_attn.v_proj.bias', 'vision_model.encoder.layers.0.self_attn.q_proj.weight', 'vision_model.encoder.layers.12.self_attn.v_proj.weight', 'vision_model.encoder.layers.14.mlp.fc2.weight', 'vision_model.encoder.layers.16.self_attn.v_proj.bias', 'vision_model.encoder.layers.21.self_attn.v_proj.bias', 'vision_model.encoder.layers.16.self_attn.q_proj.weight', 'vision_model.encoder.layers.5.self_attn.v_proj.weight', 'vision_model.encoder.layers.11.layer_norm1.bias', 'vision_model.encoder.layers.18.self_attn.out_proj.bias', 'vision_model.encoder.layers.2.self_attn.v_proj.weight', 'vision_model.encoder.layers.11.mlp.fc2.weight', 'vision_model.encoder.layers.18.self_attn.v_proj.bias', 'vision_model.encoder.layers.16.mlp.fc2.weight', 'vision_model.encoder.layers.16.self_attn.q_proj.bias', 'vision_model.encoder.layers.11.self_attn.v_proj.bias', 'vision_model.encoder.layers.22.self_attn.q_proj.bias', 'vision_model.encoder.layers.2.self_attn.k_proj.bias', 'vision_model.encoder.layers.18.layer_norm1.bias', 'vision_model.encoder.layers.3.self_attn.v_proj.weight', 'vision_model.encoder.layers.8.mlp.fc2.weight', 'vision_model.encoder.layers.16.mlp.fc2.bias', 'vision_model.encoder.layers.2.mlp.fc2.weight', 'vision_model.encoder.layers.15.self_attn.q_proj.weight', 'vision_model.encoder.layers.17.self_attn.q_proj.bias', 'vision_model.encoder.layers.9.self_attn.out_proj.weight', 'vision_model.encoder.layers.15.self_attn.v_proj.bias', 'vision_model.encoder.layers.15.layer_norm2.bias', 'vision_model.encoder.layers.19.self_attn.q_proj.bias', 'vision_model.encoder.layers.20.layer_norm1.weight', 'vision_model.encoder.layers.7.self_attn.v_proj.weight', 'vision_model.encoder.layers.18.self_attn.k_proj.bias', 'vision_model.encoder.layers.6.mlp.fc1.weight', 'vision_model.encoder.layers.5.layer_norm2.bias', 'vision_model.encoder.layers.20.self_attn.v_proj.weight', 'vision_model.encoder.layers.0.self_attn.out_proj.bias', 'vision_model.encoder.layers.12.layer_norm1.bias', 'vision_model.encoder.layers.15.self_attn.q_proj.bias', 'vision_model.encoder.layers.15.mlp.fc2.weight', 'vision_model.encoder.layers.13.mlp.fc1.weight', 'vision_model.encoder.layers.23.mlp.fc2.weight', 'vision_model.encoder.layers.0.layer_norm1.bias', 'vision_model.encoder.layers.0.self_attn.v_proj.weight', 'vision_model.encoder.layers.23.self_attn.k_proj.weight', 'vision_model.encoder.layers.6.layer_norm2.bias', 'vision_model.encoder.layers.18.mlp.fc2.weight', 'vision_model.encoder.layers.1.layer_norm2.bias', 'vision_model.encoder.layers.6.layer_norm1.weight', 'vision_model.encoder.layers.6.self_attn.v_proj.bias', 
'vision_model.encoder.layers.4.mlp.fc2.weight', 'vision_model.encoder.layers.10.mlp.fc2.weight', 'vision_model.encoder.layers.15.layer_norm2.weight', 'vision_model.encoder.layers.14.self_attn.v_proj.bias', 'vision_model.encoder.layers.9.self_attn.q_proj.bias', 'vision_model.encoder.layers.21.self_attn.v_proj.weight', 'vision_model.encoder.layers.12.self_attn.out_proj.weight', 'vision_model.encoder.layers.23.mlp.fc2.bias', 'logit_scale', 'vision_model.encoder.layers.23.self_attn.out_proj.weight', 'vision_model.encoder.layers.13.layer_norm1.weight', 'vision_model.encoder.layers.16.self_attn.v_proj.weight', 'vision_model.encoder.layers.10.layer_norm1.bias', 'vision_model.encoder.layers.9.self_attn.v_proj.weight', 'vision_model.encoder.layers.17.self_attn.out_proj.bias', 'vision_model.encoder.layers.9.self_attn.v_proj.bias', 'vision_model.encoder.layers.12.layer_norm2.bias', 'vision_model.encoder.layers.8.layer_norm1.bias', 'vision_model.encoder.layers.10.self_attn.v_proj.bias', 'vision_model.encoder.layers.17.self_attn.k_proj.weight', 'vision_model.encoder.layers.5.layer_norm1.bias', 'vision_model.encoder.layers.12.mlp.fc1.bias', 'vision_model.encoder.layers.0.mlp.fc1.weight', 'vision_model.encoder.layers.10.layer_norm1.weight', 'vision_model.encoder.layers.23.self_attn.v_proj.bias', 'vision_model.encoder.layers.17.self_attn.q_proj.weight', 'vision_model.encoder.layers.1.mlp.fc1.weight', 'vision_model.encoder.layers.22.layer_norm2.bias', 'vision_model.encoder.layers.17.mlp.fc2.weight', 'vision_model.encoder.layers.11.layer_norm1.weight', 'vision_model.encoder.layers.17.layer_norm1.weight', 'vision_model.encoder.layers.9.layer_norm1.bias', 'vision_model.encoder.layers.0.self_attn.q_proj.bias', 'vision_model.encoder.layers.21.self_attn.out_proj.bias', 'vision_model.encoder.layers.11.self_attn.q_proj.weight', 'vision_model.encoder.layers.23.layer_norm2.bias', 'vision_model.encoder.layers.6.self_attn.k_proj.weight', 'vision_model.encoder.layers.0.self_attn.k_proj.weight', 'vision_model.encoder.layers.15.self_attn.out_proj.bias', 'vision_model.encoder.layers.12.layer_norm2.weight', 'vision_model.encoder.layers.8.self_attn.v_proj.bias', 'vision_model.encoder.layers.3.self_attn.out_proj.bias', 'vision_model.encoder.layers.23.self_attn.v_proj.weight', 'vision_model.encoder.layers.20.mlp.fc1.bias', 'vision_model.encoder.layers.22.self_attn.v_proj.bias', 'vision_model.encoder.layers.15.layer_norm1.bias', 'vision_model.encoder.layers.20.layer_norm2.weight', 'vision_model.encoder.layers.11.self_attn.q_proj.bias', 'vision_model.encoder.layers.6.self_attn.out_proj.weight', 'vision_model.encoder.layers.7.mlp.fc1.weight', 'vision_model.encoder.layers.13.self_attn.out_proj.bias', 'vision_model.encoder.layers.15.self_attn.v_proj.weight', 'vision_model.encoder.layers.1.self_attn.q_proj.weight', 'vision_model.encoder.layers.6.self_attn.v_proj.weight', 'vision_model.encoder.layers.12.self_attn.out_proj.bias', 'vision_model.pre_layrnorm.weight', 'vision_model.encoder.layers.10.mlp.fc2.bias', 'vision_model.encoder.layers.18.mlp.fc1.weight', 'vision_model.encoder.layers.3.mlp.fc1.weight', 'vision_model.encoder.layers.8.self_attn.k_proj.weight', 'vision_model.encoder.layers.18.mlp.fc1.bias', 'vision_model.encoder.layers.9.mlp.fc1.bias', 'vision_model.encoder.layers.1.mlp.fc1.bias', 'vision_model.encoder.layers.2.mlp.fc2.bias', 'vision_model.encoder.layers.21.self_attn.k_proj.weight', 'vision_model.encoder.layers.20.self_attn.q_proj.bias', 'vision_model.encoder.layers.22.self_attn.v_proj.weight', 
'vision_model.encoder.layers.8.self_attn.q_proj.bias', 'vision_model.encoder.layers.15.mlp.fc1.weight', 'vision_model.encoder.layers.13.layer_norm1.bias', 'vision_model.encoder.layers.1.layer_norm1.bias', 'vision_model.encoder.layers.2.self_attn.out_proj.weight', 'vision_model.encoder.layers.1.self_attn.out_proj.bias', 'vision_model.encoder.layers.4.self_attn.v_proj.weight', 'vision_model.encoder.layers.10.layer_norm2.bias', 'vision_model.encoder.layers.11.self_attn.k_proj.bias', 'vision_model.encoder.layers.4.layer_norm2.bias', 'vision_model.encoder.layers.8.self_attn.v_proj.weight', 'vision_model.encoder.layers.10.self_attn.k_proj.weight', 'vision_model.encoder.layers.4.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.self_attn.out_proj.bias', 'vision_model.encoder.layers.4.self_attn.out_proj.bias', 'vision_model.encoder.layers.21.self_attn.out_proj.weight', 'vision_model.encoder.layers.5.self_attn.k_proj.bias', 'vision_model.encoder.layers.21.layer_norm2.bias', 'vision_model.encoder.layers.19.layer_norm2.weight', 'vision_model.encoder.layers.22.layer_norm1.bias', 'text_projection.weight', 'vision_model.encoder.layers.17.layer_norm2.bias', 'vision_model.encoder.layers.19.self_attn.v_proj.bias', 'vision_model.encoder.layers.8.layer_norm2.bias', 'vision_model.encoder.layers.10.mlp.fc1.bias', 'vision_model.encoder.layers.2.layer_norm1.bias', 'vision_model.encoder.layers.16.layer_norm2.bias', 'vision_model.encoder.layers.22.self_attn.k_proj.weight', 'vision_model.encoder.layers.19.mlp.fc1.weight', 'vision_model.encoder.layers.11.mlp.fc1.weight', 'vision_model.encoder.layers.0.layer_norm2.bias', 'vision_model.encoder.layers.2.mlp.fc1.weight', 'vision_model.encoder.layers.4.layer_norm2.weight', 'vision_model.encoder.layers.10.self_attn.k_proj.bias', 'vision_model.encoder.layers.7.self_attn.q_proj.bias', 'vision_model.encoder.layers.3.layer_norm1.weight', 'vision_model.encoder.layers.6.self_attn.q_proj.weight', 'vision_model.encoder.layers.13.mlp.fc2.weight', 'vision_model.encoder.layers.21.layer_norm2.weight', 'vision_model.encoder.layers.9.self_attn.out_proj.bias', 'vision_model.encoder.layers.17.self_attn.k_proj.bias', 'vision_model.encoder.layers.9.layer_norm2.bias', 'vision_model.encoder.layers.21.mlp.fc1.bias', 'vision_model.encoder.layers.4.layer_norm1.weight', 'vision_model.encoder.layers.8.self_attn.out_proj.weight', 'vision_model.encoder.layers.17.self_attn.v_proj.weight', 'vision_model.encoder.layers.16.self_attn.out_proj.weight', 'vision_model.encoder.layers.23.layer_norm1.weight', 'vision_model.encoder.layers.12.self_attn.q_proj.bias', 'vision_model.encoder.layers.7.self_attn.out_proj.weight', 'vision_model.encoder.layers.18.self_attn.k_proj.weight', 'vision_model.encoder.layers.3.mlp.fc2.bias', 'vision_model.encoder.layers.19.mlp.fc2.bias', 'vision_model.encoder.layers.21.self_attn.q_proj.weight', 'vision_model.encoder.layers.15.mlp.fc1.bias', 'vision_model.encoder.layers.1.layer_norm2.weight', 'vision_model.encoder.layers.12.self_attn.k_proj.weight', 'vision_model.encoder.layers.18.self_attn.out_proj.weight', 'vision_model.encoder.layers.23.layer_norm2.weight', 'vision_model.encoder.layers.4.self_attn.k_proj.weight', 'vision_model.encoder.layers.19.layer_norm1.bias', 'vision_model.encoder.layers.7.mlp.fc2.weight', 'vision_model.encoder.layers.2.layer_norm2.weight']
__init__. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in torch.jit.Attribute.
warnings.warn("The TorchScript type system doesn't support "
No vmfb found. Compiling and saving to C:\Users\M\clip1_64_512_512_fp16_E_nod_epicDiffusion_11_vulkan-00000000-0900-0000-0000-000000000000.vmfb
Using target triple -iree-vulkan-target-triple=rdna3-7900-windows from command line args
Saved vmfb in C:\Users\M\clip1_64_512_512_fp16_E_nod_epicDiffusion_11_vulkan-00000000-0900-0000-0000-000000000000.vmfb.
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Retrying with a different base model configuration
loading existing vmfb from: C:\Users\M\clip1_64_512_512_fp16_E_nod_epicDiffusion_11_vulkan-00000000-0900-0000-0000-000000000000.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Retrying with a different base model configuration
Cannot compile the model. Please use enable_stack_trace and create an issue at https://github.com/nod-ai/SHARK/issues
Hi @MrManiacal, these logs don't shed much light on the real issue.
Please use the --enable_stack_trace flag, as mentioned, while running the above set of commands:
Cannot compile the model. Please use `enable_stack_trace` and create an issue at https://github.com/nod-ai/SHARK/issues
Also, I'm not sure if you're using the latest branch, but can you also try the --no-use_tuned flag?
So effectively you'd need to use --import_mlir + --no-use_tuned + --enable_stack_trace for both commands, i.e. both with --hf_model_id and with --ckpt_loc.
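For example, something along these lines (prompts and paths are illustrative - reuse the ones from your earlier runs):
python main.py --import_mlir --no-use_tuned --enable_stack_trace --hf_model_id="andite/anything-v4.0" --max_length=77 --prompt="your prompt here"
python main.py --import_mlir --no-use_tuned --enable_stack_trace --ckpt_loc="D:/nod/anything-v4.0/anything-v4.0-pruned-fp32.ckpt" --max_length=77 --prompt="your prompt here"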
I'm using the latest main branch.
Here's the result when I do the HF model ID command:
(shark.venv) PS D:\Nod\SHARK-main>> python D:\nod\shark-main\shark\examples\shark_inference\stable_diffusion\main.py --hf_model_id="andite/anything-v4.0" --max_length=77 --prompt="tajmahal, oil on canvas, sunflowers, 4k, uhd" --enable_stack_trace --no-use_tuned
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['tajmahal, oil on canvas, sunflowers, 4k, uhd']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Tuned models are currently not supported for this setting.
Traceback (most recent call last):
File "D:\nod\shark-main\shark\examples\shark_inference\stable_diffusion\main.py", line 111, in
However, with the inference files local, I do get this with the local checkpooint: (shark.venv) PS D:\Nod\SHARK-main>> python D:\nod\shark-main\shark\examples\shark_inference\stable_diffusion\main.py --max_length=77 --prompt="1girl, brown hair, green eyes, colorful, Winter, cumulonimbus clouds, lighting, blue sky, falling snow, garden" --import_mlir --enable_stack_trace --no-use_tuned --ckpt_loc="E:/Nod/anything-v4.0/anything-v4.0-pruned-fp32.ckpt" shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag Running StableDiffusion with the following config :- Batch size : 1 Prompts : ['1girl, brown hair, green eyes, colorful, Winter, cumulonimbus clouds, lighting, blue sky, falling snow, garden'] Runs : 1 Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows. Tuned models are currently not supported for this setting. Created directory : anything-v4.0-pruned-fp32 at -> E:\Nod\anything-v4.0 SD to Diffusers converter already exists Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: ['vision_model.encoder.layers.8.layer_norm2.weight', 'vision_model.encoder.layers.13.layer_norm2.weight', 'vision_model.encoder.layers.5.self_attn.q_proj.weight', 'vision_model.encoder.layers.17.layer_norm2.bias', 'vision_model.encoder.layers.18.layer_norm1.weight', 'vision_model.encoder.layers.9.mlp.fc1.weight', 'vision_model.encoder.layers.8.self_attn.q_proj.bias', 'vision_model.encoder.layers.16.self_attn.k_proj.weight', 'vision_model.encoder.layers.8.self_attn.v_proj.bias', 'vision_model.encoder.layers.7.layer_norm2.weight', 'vision_model.encoder.layers.17.self_attn.v_proj.bias', 'vision_model.encoder.layers.19.self_attn.k_proj.weight', 'vision_model.encoder.layers.10.mlp.fc2.weight', 'vision_model.encoder.layers.0.self_attn.k_proj.weight', 'vision_model.encoder.layers.11.self_attn.v_proj.weight', 'vision_model.encoder.layers.0.mlp.fc2.bias', 'vision_model.encoder.layers.15.self_attn.q_proj.weight', 'vision_model.encoder.layers.14.mlp.fc2.bias', 'vision_model.encoder.layers.4.mlp.fc1.bias', 'vision_model.encoder.layers.8.mlp.fc2.bias', 'vision_model.encoder.layers.11.layer_norm2.bias', 'vision_model.encoder.layers.18.self_attn.out_proj.weight', 'vision_model.encoder.layers.16.layer_norm2.weight', 'vision_model.encoder.layers.15.layer_norm2.bias', 'vision_model.encoder.layers.5.layer_norm2.bias', 'vision_model.encoder.layers.4.self_attn.v_proj.weight', 'vision_model.encoder.layers.13.mlp.fc2.bias', 'vision_model.encoder.layers.11.mlp.fc1.weight', 'vision_model.encoder.layers.6.mlp.fc2.weight', 'vision_model.encoder.layers.16.mlp.fc1.bias', 'vision_model.encoder.layers.10.self_attn.q_proj.weight', 'vision_model.encoder.layers.22.mlp.fc2.weight', 'vision_model.encoder.layers.1.layer_norm1.weight', 'vision_model.encoder.layers.15.self_attn.out_proj.bias', 'vision_model.encoder.layers.2.layer_norm1.bias', 'vision_model.encoder.layers.9.layer_norm1.weight', 'vision_model.encoder.layers.10.mlp.fc1.weight', 'vision_model.encoder.layers.2.layer_norm1.weight', 'vision_model.encoder.layers.12.self_attn.k_proj.weight', 'vision_model.encoder.layers.2.layer_norm2.bias', 'vision_model.encoder.layers.10.layer_norm1.weight', 'vision_model.encoder.layers.1.mlp.fc1.bias', 'vision_model.encoder.layers.2.self_attn.q_proj.bias', 'vision_model.encoder.layers.17.self_attn.v_proj.weight', 'vision_model.encoder.layers.0.self_attn.v_proj.bias', 
'vision_model.encoder.layers.12.self_attn.out_proj.weight', 'vision_model.encoder.layers.7.self_attn.out_proj.bias', 'vision_model.encoder.layers.11.self_attn.q_proj.bias', 'vision_model.encoder.layers.12.self_attn.q_proj.bias', 'vision_model.encoder.layers.9.self_attn.k_proj.bias', 'vision_model.encoder.layers.23.mlp.fc1.bias', 'vision_model.encoder.layers.16.layer_norm1.bias', 'vision_model.encoder.layers.0.self_attn.k_proj.bias', 'vision_model.encoder.layers.18.layer_norm2.weight', 'vision_model.encoder.layers.6.self_attn.k_proj.weight', 'vision_model.encoder.layers.1.mlp.fc2.weight', 'vision_model.encoder.layers.10.mlp.fc1.bias', 'vision_model.encoder.layers.15.self_attn.out_proj.weight', 'vision_model.encoder.layers.5.self_attn.out_proj.weight', 'vision_model.encoder.layers.23.layer_norm1.weight', 'vision_model.encoder.layers.21.self_attn.out_proj.weight', 'vision_model.encoder.layers.3.self_attn.v_proj.bias', 'vision_model.encoder.layers.13.self_attn.k_proj.bias', 'vision_model.encoder.layers.7.layer_norm1.bias', 'vision_model.encoder.layers.12.layer_norm1.weight', 'vision_model.encoder.layers.21.self_attn.k_proj.weight', 'vision_model.encoder.layers.19.layer_norm2.weight', 'vision_model.encoder.layers.20.layer_norm2.weight', 'logit_scale', 'vision_model.encoder.layers.22.layer_norm2.weight', 'vision_model.encoder.layers.11.self_attn.q_proj.weight', 'vision_model.encoder.layers.19.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.mlp.fc1.bias', 'vision_model.encoder.layers.16.layer_norm1.weight', 'vision_model.encoder.layers.23.self_attn.v_proj.weight', 'vision_model.encoder.layers.7.self_attn.v_proj.bias', 'vision_model.encoder.layers.1.mlp.fc1.weight', 'vision_model.encoder.layers.4.layer_norm2.bias', 'vision_model.encoder.layers.13.mlp.fc1.bias', 'vision_model.encoder.layers.20.self_attn.out_proj.weight', 'vision_model.encoder.layers.1.self_attn.q_proj.weight', 'vision_model.encoder.layers.7.mlp.fc2.weight', 'vision_model.encoder.layers.8.self_attn.v_proj.weight', 'vision_model.encoder.layers.2.self_attn.q_proj.weight', 'vision_model.encoder.layers.9.layer_norm2.bias', 'vision_model.encoder.layers.12.layer_norm2.weight', 'vision_model.encoder.layers.11.self_attn.k_proj.bias', 'vision_model.encoder.layers.2.self_attn.v_proj.bias', 'vision_model.encoder.layers.17.self_attn.q_proj.bias', 'vision_model.encoder.layers.8.mlp.fc2.weight', 'vision_model.encoder.layers.18.mlp.fc2.bias', 'vision_model.encoder.layers.11.mlp.fc2.weight', 'vision_model.encoder.layers.10.self_attn.v_proj.bias', 'vision_model.encoder.layers.4.mlp.fc2.weight', 'vision_model.encoder.layers.16.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.self_attn.out_proj.weight', 'vision_model.encoder.layers.12.self_attn.k_proj.bias', 'vision_model.encoder.layers.19.mlp.fc1.bias', 'vision_model.encoder.layers.2.self_attn.out_proj.weight', 'vision_model.encoder.layers.11.self_attn.out_proj.weight', 'vision_model.encoder.layers.21.mlp.fc2.weight', 'vision_model.encoder.layers.3.self_attn.q_proj.bias', 'vision_model.encoder.layers.0.layer_norm1.weight', 'vision_model.encoder.layers.8.layer_norm2.bias', 'vision_model.pre_layrnorm.bias', 'vision_model.encoder.layers.18.mlp.fc1.weight', 'vision_model.encoder.layers.5.layer_norm1.bias', 'vision_model.encoder.layers.9.layer_norm2.weight', 'vision_model.encoder.layers.6.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.mlp.fc2.weight', 'vision_model.encoder.layers.6.self_attn.v_proj.weight', 'vision_model.encoder.layers.7.self_attn.q_proj.bias', 
'vision_model.encoder.layers.3.self_attn.out_proj.weight', 'vision_model.encoder.layers.13.self_attn.k_proj.weight', 'vision_model.encoder.layers.23.layer_norm2.weight', 'vision_model.encoder.layers.14.mlp.fc2.weight', 'vision_model.encoder.layers.22.self_attn.k_proj.weight', 'vision_model.encoder.layers.6.layer_norm1.bias', 'vision_model.encoder.layers.3.layer_norm1.bias', 'vision_model.encoder.layers.19.mlp.fc1.weight', 'vision_model.encoder.layers.21.layer_norm1.weight', 'vision_model.encoder.layers.22.layer_norm1.weight', 'vision_model.encoder.layers.12.mlp.fc1.weight', 'vision_model.encoder.layers.21.self_attn.q_proj.weight', 'vision_model.encoder.layers.12.layer_norm2.bias', 'vision_model.encoder.layers.6.self_attn.q_proj.bias', 'vision_model.encoder.layers.4.mlp.fc2.bias', 'vision_model.encoder.layers.14.layer_norm2.weight', 'vision_model.encoder.layers.19.layer_norm2.bias', 'vision_model.encoder.layers.22.self_attn.k_proj.bias', 'vision_model.encoder.layers.16.self_attn.out_proj.weight', 'vision_model.encoder.layers.18.self_attn.q_proj.weight', 'vision_model.encoder.layers.10.layer_norm2.weight', 'vision_model.encoder.layers.16.self_attn.v_proj.weight', 'vision_model.encoder.layers.22.self_attn.v_proj.weight', 'vision_model.encoder.layers.23.layer_norm2.bias', 'vision_model.encoder.layers.4.self_attn.out_proj.weight', 'vision_model.encoder.layers.8.self_attn.q_proj.weight', 'vision_model.encoder.layers.23.self_attn.k_proj.weight', 'vision_model.encoder.layers.1.self_attn.v_proj.weight', 'vision_model.encoder.layers.18.mlp.fc2.weight', 'vision_model.encoder.layers.5.self_attn.v_proj.bias', 'vision_model.encoder.layers.14.layer_norm1.bias', 'vision_model.encoder.layers.23.self_attn.k_proj.bias', 'vision_model.encoder.layers.5.mlp.fc2.bias', 'vision_model.encoder.layers.7.mlp.fc2.bias', 'vision_model.encoder.layers.12.self_attn.v_proj.weight', 'vision_model.encoder.layers.3.self_attn.out_proj.bias', 'vision_model.encoder.layers.13.self_attn.v_proj.bias', 'vision_model.encoder.layers.16.mlp.fc2.bias', 'vision_model.encoder.layers.8.layer_norm1.bias', 'vision_model.encoder.layers.7.mlp.fc1.weight', 'vision_model.encoder.layers.18.self_attn.v_proj.bias', 'vision_model.encoder.layers.14.self_attn.q_proj.bias', 'vision_model.encoder.layers.21.self_attn.v_proj.bias', 'vision_model.encoder.layers.22.mlp.fc1.weight', 'vision_model.encoder.layers.15.self_attn.k_proj.weight', 'vision_model.encoder.layers.13.mlp.fc1.weight', 'vision_model.encoder.layers.22.mlp.fc1.bias', 'vision_model.encoder.layers.10.self_attn.v_proj.weight', 'vision_model.encoder.layers.14.self_attn.v_proj.bias', 'vision_model.encoder.layers.13.self_attn.q_proj.bias', 'vision_model.encoder.layers.12.self_attn.q_proj.weight', 'vision_model.encoder.layers.15.self_attn.k_proj.bias', 'vision_model.encoder.layers.0.mlp.fc2.weight', 'vision_model.encoder.layers.17.self_attn.k_proj.weight', 'vision_model.encoder.layers.9.self_attn.v_proj.weight', 'vision_model.encoder.layers.17.self_attn.q_proj.weight', 'vision_model.encoder.layers.18.self_attn.k_proj.bias', 'vision_model.encoder.layers.18.self_attn.k_proj.weight', 'vision_model.encoder.layers.21.self_attn.v_proj.weight', 'vision_model.encoder.layers.20.self_attn.k_proj.bias', 'vision_model.encoder.layers.0.mlp.fc1.weight', 'vision_model.encoder.layers.3.layer_norm2.weight', 'vision_model.encoder.layers.20.self_attn.k_proj.weight', 'vision_model.encoder.layers.21.layer_norm2.weight', 'vision_model.encoder.layers.4.layer_norm1.weight', 'vision_model.encoder.layers.7.mlp.fc1.bias', 
'vision_model.encoder.layers.23.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.self_attn.k_proj.bias', 'vision_model.encoder.layers.7.self_attn.v_proj.weight', 'vision_model.encoder.layers.15.mlp.fc2.bias', 'vision_model.encoder.layers.23.layer_norm1.bias', 'vision_model.encoder.layers.9.self_attn.q_proj.weight', 'vision_model.encoder.layers.4.self_attn.out_proj.bias', 'vision_model.encoder.layers.13.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.mlp.fc2.bias', 'vision_model.encoder.layers.20.layer_norm2.bias', 'vision_model.encoder.layers.23.mlp.fc1.weight', 'vision_model.encoder.layers.7.self_attn.k_proj.bias', 'vision_model.encoder.layers.17.layer_norm1.weight', 'vision_model.encoder.layers.17.self_attn.out_proj.bias', 'vision_model.encoder.layers.0.layer_norm2.weight', 'vision_model.encoder.layers.9.self_attn.q_proj.bias', 'vision_model.encoder.layers.16.mlp.fc1.weight', 'vision_model.encoder.layers.9.mlp.fc2.weight', 'vision_model.encoder.layers.20.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.self_attn.out_proj.weight', 'vision_model.encoder.layers.21.layer_norm1.bias', 'vision_model.encoder.layers.14.layer_norm1.weight', 'vision_model.encoder.layers.23.self_attn.q_proj.bias', 'vision_model.encoder.layers.0.self_attn.out_proj.weight', 'vision_model.encoder.layers.7.layer_norm2.bias', 'vision_model.encoder.layers.4.self_attn.q_proj.bias', 'vision_model.encoder.layers.7.self_attn.out_proj.weight', 'vision_model.encoder.layers.16.self_attn.out_proj.bias', 'vision_model.encoder.layers.1.self_attn.k_proj.bias', 'vision_model.encoder.layers.12.mlp.fc1.bias', 'vision_model.encoder.layers.13.self_attn.out_proj.bias', 'vision_model.encoder.layers.6.layer_norm1.weight', 'vision_model.embeddings.class_embedding', 'vision_model.encoder.layers.22.self_attn.q_proj.weight', 'vision_model.encoder.layers.0.layer_norm2.bias', 'vision_model.encoder.layers.3.mlp.fc1.weight', 'vision_model.encoder.layers.18.self_attn.q_proj.bias', 'vision_model.encoder.layers.3.mlp.fc2.weight', 'vision_model.encoder.layers.1.layer_norm2.weight', 'vision_model.encoder.layers.3.self_attn.k_proj.weight', 'vision_model.encoder.layers.10.self_attn.out_proj.bias', 'vision_model.encoder.layers.13.self_attn.v_proj.weight', 'vision_model.encoder.layers.0.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.self_attn.k_proj.weight', 'vision_model.encoder.layers.0.self_attn.v_proj.weight', 'vision_model.encoder.layers.2.mlp.fc2.weight', 'text_projection.weight', 'vision_model.encoder.layers.11.layer_norm2.weight', 'vision_model.encoder.layers.2.mlp.fc1.bias', 'vision_model.encoder.layers.14.self_attn.k_proj.bias', 'vision_model.encoder.layers.2.self_attn.k_proj.weight', 'vision_model.encoder.layers.10.self_attn.k_proj.weight', 'vision_model.encoder.layers.9.self_attn.out_proj.weight', 'vision_model.encoder.layers.5.self_attn.k_proj.bias', 'vision_model.encoder.layers.4.self_attn.q_proj.weight', 'vision_model.encoder.layers.20.self_attn.q_proj.bias', 'vision_model.encoder.layers.9.self_attn.v_proj.bias', 'vision_model.encoder.layers.15.mlp.fc1.bias', 'vision_model.encoder.layers.13.mlp.fc2.weight', 'visual_projection.weight', 'vision_model.encoder.layers.0.self_attn.q_proj.weight', 'vision_model.encoder.layers.6.mlp.fc1.bias', 'vision_model.encoder.layers.8.self_attn.k_proj.weight', 'vision_model.encoder.layers.16.self_attn.k_proj.bias', 'vision_model.encoder.layers.19.mlp.fc2.bias', 'vision_model.encoder.layers.17.layer_norm1.bias', 'vision_model.encoder.layers.8.layer_norm1.weight', 
'vision_model.encoder.layers.12.mlp.fc2.weight', 'vision_model.encoder.layers.8.self_attn.out_proj.bias', 'vision_model.encoder.layers.11.self_attn.v_proj.bias', 'vision_model.encoder.layers.20.layer_norm1.weight', 'vision_model.encoder.layers.19.self_attn.v_proj.weight', 'vision_model.encoder.layers.9.layer_norm1.bias', 'vision_model.encoder.layers.19.mlp.fc2.weight', 'vision_model.encoder.layers.12.self_attn.out_proj.bias', 'vision_model.encoder.layers.15.layer_norm2.weight', 'vision_model.encoder.layers.21.layer_norm2.bias', 'vision_model.encoder.layers.3.mlp.fc1.bias', 'vision_model.encoder.layers.11.layer_norm1.weight', 'vision_model.encoder.layers.21.self_attn.q_proj.bias', 'vision_model.encoder.layers.3.layer_norm1.weight', 'vision_model.encoder.layers.18.self_attn.out_proj.bias', 'vision_model.encoder.layers.17.mlp.fc2.bias', 'vision_model.encoder.layers.1.self_attn.k_proj.weight', 'vision_model.encoder.layers.2.self_attn.k_proj.bias', 'vision_model.encoder.layers.6.mlp.fc2.bias', 'vision_model.encoder.layers.23.mlp.fc2.bias', 'vision_model.encoder.layers.3.self_attn.v_proj.weight', 'vision_model.encoder.layers.4.layer_norm2.weight', 'vision_model.pre_layrnorm.weight', 'vision_model.encoder.layers.1.mlp.fc2.bias', 'vision_model.encoder.layers.6.self_attn.q_proj.weight', 'vision_model.encoder.layers.8.self_attn.k_proj.bias', 'vision_model.encoder.layers.23.self_attn.q_proj.weight', 'vision_model.encoder.layers.0.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.self_attn.out_proj.bias', 'vision_model.encoder.layers.5.self_attn.k_proj.weight', 'vision_model.encoder.layers.7.self_attn.q_proj.weight', 'vision_model.encoder.layers.20.mlp.fc1.bias', 'vision_model.encoder.layers.6.layer_norm2.bias', 'vision_model.encoder.layers.3.layer_norm2.bias', 'vision_model.encoder.layers.2.mlp.fc2.bias', 'vision_model.encoder.layers.19.self_attn.q_proj.bias', 'vision_model.encoder.layers.2.self_attn.v_proj.weight', 'vision_model.encoder.layers.11.mlp.fc1.bias', 'vision_model.encoder.layers.22.layer_norm2.bias', 'vision_model.encoder.layers.14.self_attn.v_proj.weight', 'vision_model.encoder.layers.1.self_attn.v_proj.bias', 'vision_model.encoder.layers.1.self_attn.q_proj.bias', 'vision_model.encoder.layers.6.self_attn.out_proj.bias', 'vision_model.encoder.layers.15.self_attn.v_proj.weight', 'vision_model.encoder.layers.3.self_attn.k_proj.bias', 'vision_model.encoder.layers.9.mlp.fc2.bias', 'vision_model.encoder.layers.14.self_attn.q_proj.weight', 'vision_model.encoder.layers.1.layer_norm2.bias', 'vision_model.encoder.layers.17.mlp.fc2.weight', 'vision_model.encoder.layers.4.self_attn.k_proj.weight', 'vision_model.encoder.layers.12.layer_norm1.bias', 'vision_model.encoder.layers.9.mlp.fc1.bias', 'vision_model.encoder.layers.5.mlp.fc1.weight', 'vision_model.encoder.layers.10.self_attn.q_proj.bias', 'vision_model.encoder.layers.15.layer_norm1.bias', 'vision_model.encoder.layers.11.layer_norm1.bias', 'vision_model.encoder.layers.19.self_attn.q_proj.weight', 'vision_model.encoder.layers.22.layer_norm1.bias', 'vision_model.encoder.layers.15.mlp.fc2.weight', 'vision_model.encoder.layers.2.layer_norm2.weight', 'vision_model.encoder.layers.19.self_attn.v_proj.bias', 'vision_model.encoder.layers.3.mlp.fc2.bias', 'vision_model.encoder.layers.17.mlp.fc1.weight', 'vision_model.encoder.layers.5.self_attn.v_proj.weight', 'vision_model.encoder.layers.2.mlp.fc1.weight', 'vision_model.encoder.layers.22.self_attn.out_proj.weight', 'vision_model.encoder.layers.5.self_attn.out_proj.bias', 
'vision_model.encoder.layers.11.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.mlp.fc1.weight', 'vision_model.encoder.layers.17.mlp.fc1.bias', 'vision_model.encoder.layers.13.self_attn.q_proj.weight', 'vision_model.encoder.layers.11.mlp.fc2.bias', 'vision_model.encoder.layers.6.self_attn.v_proj.bias', 'vision_model.encoder.layers.10.mlp.fc2.bias', 'vision_model.encoder.layers.16.self_attn.q_proj.weight', 'vision_model.encoder.layers.15.mlp.fc1.weight', 'vision_model.encoder.layers.5.mlp.fc2.weight', 'vision_model.encoder.layers.13.layer_norm2.bias', 'vision_model.encoder.layers.18.self_attn.v_proj.weight', 'vision_model.post_layernorm.bias', 'vision_model.encoder.layers.10.layer_norm1.bias', 'vision_model.embeddings.position_ids', 'vision_model.encoder.layers.18.mlp.fc1.bias', 'vision_model.embeddings.patch_embedding.weight', 'vision_model.encoder.layers.13.layer_norm1.weight', 'vision_model.encoder.layers.16.mlp.fc2.weight', 'vision_model.encoder.layers.8.mlp.fc1.bias', 'vision_model.encoder.layers.23.mlp.fc2.weight', 'vision_model.encoder.layers.14.self_attn.out_proj.bias', 'vision_model.encoder.layers.20.self_attn.v_proj.bias', 'vision_model.encoder.layers.22.self_attn.out_proj.bias', 'vision_model.encoder.layers.4.layer_norm1.bias', 'vision_model.encoder.layers.16.self_attn.v_proj.bias', 'vision_model.encoder.layers.19.self_attn.k_proj.bias', 'vision_model.encoder.layers.19.layer_norm1.bias', 'vision_model.encoder.layers.5.self_attn.q_proj.bias', 'vision_model.encoder.layers.0.mlp.fc1.bias', 'vision_model.encoder.layers.18.layer_norm2.bias', 'vision_model.encoder.layers.1.layer_norm1.bias', 'vision_model.encoder.layers.8.self_attn.out_proj.weight', 'vision_model.encoder.layers.6.layer_norm2.weight', 'vision_model.encoder.layers.7.layer_norm1.weight', 'vision_model.encoder.layers.5.layer_norm1.weight', 'vision_model.encoder.layers.19.layer_norm1.weight', 'vision_model.encoder.layers.19.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.mlp.fc1.weight', 'vision_model.encoder.layers.7.self_attn.k_proj.weight', 'vision_model.encoder.layers.17.layer_norm2.weight', 'vision_model.encoder.layers.21.self_attn.out_proj.bias', 'vision_model.encoder.layers.15.self_attn.v_proj.bias', 'vision_model.encoder.layers.15.layer_norm1.weight', 'vision_model.encoder.layers.6.mlp.fc1.weight', 'vision_model.embeddings.position_embedding.weight', 'vision_model.encoder.layers.4.self_attn.k_proj.bias', 'vision_model.encoder.layers.2.self_attn.out_proj.bias', 'vision_model.encoder.layers.12.mlp.fc2.bias', 'vision_model.encoder.layers.5.mlp.fc1.bias', 'vision_model.encoder.layers.5.layer_norm2.weight', 'vision_model.encoder.layers.21.mlp.fc1.weight', 'vision_model.encoder.layers.16.layer_norm2.bias', 'vision_model.encoder.layers.21.self_attn.k_proj.bias', 'vision_model.encoder.layers.10.self_attn.out_proj.weight', 'vision_model.encoder.layers.14.layer_norm2.bias', 'vision_model.encoder.layers.18.layer_norm1.bias', 'vision_model.post_layernorm.weight', 'vision_model.encoder.layers.3.self_attn.q_proj.weight', 'vision_model.encoder.layers.22.mlp.fc2.bias', 'vision_model.encoder.layers.22.self_attn.q_proj.bias', 'vision_model.encoder.layers.15.self_attn.q_proj.bias', 'vision_model.encoder.layers.13.layer_norm1.bias', 'vision_model.encoder.layers.8.mlp.fc1.weight', 'vision_model.encoder.layers.20.self_attn.q_proj.weight', 'vision_model.encoder.layers.9.self_attn.k_proj.weight', 'vision_model.encoder.layers.9.self_attn.out_proj.bias', 'vision_model.encoder.layers.21.mlp.fc1.bias', 
'vision_model.encoder.layers.12.self_attn.v_proj.bias', 'vision_model.encoder.layers.11.self_attn.k_proj.weight', 'vision_model.encoder.layers.17.self_attn.k_proj.bias', 'vision_model.encoder.layers.21.mlp.fc2.bias', 'vision_model.encoder.layers.17.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.self_attn.v_proj.weight', 'vision_model.encoder.layers.0.layer_norm1.bias', 'vision_model.encoder.layers.4.self_attn.v_proj.bias', 'vision_model.encoder.layers.23.self_attn.out_proj.weight', 'vision_model.encoder.layers.6.self_attn.k_proj.bias', 'vision_model.encoder.layers.22.self_attn.v_proj.bias', 'vision_model.encoder.layers.4.mlp.fc1.weight', 'vision_model.encoder.layers.20.layer_norm1.bias', 'vision_model.encoder.layers.10.layer_norm2.bias', 'vision_model.encoder.layers.23.self_attn.v_proj.bias']
UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in __init__. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in torch.jit.Attribute.
warnings.warn("The TorchScript type system doesn't support "
No vmfb found. Compiling and saving to D:\Nod\SHARK-main\unet1_77_512_512_fp16_E_Nod_anything_v4_0_anything_v4_0_pruned_fp32_vulkan-00000000-0900-0000-0000-000000000000.vmfb
Using target triple -iree-vulkan-target-triple=rdna3-7900-windows from command line args
Saved vmfb in D:\Nod\SHARK-main\unet1_77_512_512_fp16_E_Nod_anything_v4_0_anything_v4_0_pruned_fp32_vulkan-00000000-0900-0000-0000-000000000000.vmfb.
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
No vmfb found. Compiling and saving to D:\Nod\SHARK-main\vae1_77_512_512_fp16_E_Nod_anything_v4_0_anything_v4_0_pruned_fp32_vulkan-00000000-0900-0000-0000-000000000000.vmfb
Using target triple -iree-vulkan-target-triple=rdna3-7900-windows from command line args
Saved vmfb in D:\Nod\SHARK-main\vae1_77_512_512_fp16_E_Nod_anything_v4_0_anything_v4_0_pruned_fp32_vulkan-00000000-0900-0000-0000-000000000000.vmfb.
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
No vmfb found. Compiling and saving to D:\Nod\SHARK-main\clip1_77_512_512_fp16_E_Nod_anything_v4_0_anything_v4_0_pruned_fp32_vulkan-00000000-0900-0000-0000-000000000000.vmfb
Using target triple -iree-vulkan-target-triple=rdna3-7900-windows from command line args
Saved vmfb in D:\Nod\SHARK-main\clip1_77_512_512_fp16_E_Nod_anything_v4_0_anything_v4_0_pruned_fp32_vulkan-00000000-0900-0000-0000-000000000000.vmfb.
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Stats for run 0: Average step time: 276.8403434753418ms/it
Clip Inference time (ms) = 33.007
VAE Inference time (ms): 731.695
Total image generation time: 14.624723196029663sec
Basically, the wget command was failing to retrieve the inference yaml files when loading the custom checkpoint. Once I manually placed them in the path it was looking for and used the --no-use_tuned flag, image generation succeeded. I did see a runtime error, though (RuntimeError: mat1 and mat2 shapes cannot be multiplied (154x1024 and 768x320)), and the process is very slow each time, probably because it's loading the checkpoint each time the command is run?
Here's the result when I do the HF model ID command:
(shark.venv) PS D:\Nod\SHARK-main> python D:\nod\shark-main\shark\examples\shark_inference\stable_diffusion\main.py --hf_model_id="andite/anything-v4.0" --max_length=77 --prompt="tajmahal, oil on canvas, sunflowers, 4k, uhd" --enable_stack_trace --no-use_tuned
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['tajmahal, oil on canvas, sunflowers, 4k, uhd']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Tuned models are currently not supported for this setting.
Traceback (most recent call last):
  File "D:\nod\shark-main\shark\examples\shark_inference\stable_diffusion\main.py", line 111, in
    from opt_params import get_unet, get_vae, get_clip
  File "D:\nod\shark-main\shark\examples\shark_inference\stable_diffusion\opt_params.py", line 20, in
    variant, version = hf_model_variant_map[args.hf_model_id]
KeyError: 'andite/anything-v4.0'
Please include the --import_mlir flag in your run command. As for --ckpt_loc, it's good to know it worked!
Regarding the wget issue - we'll take a dig at it.
The error (RuntimeError: mat1 and mat2 ...) is expected - no issues here. That's why you'd observe that the program retries the compilation with some other base configuration for the model and then applies the weights.

Running on RDNA3 (7900 XTX) w/ Adrenalin 23.1.2 driver.
Using the examples in shark/examples/shark_inference/stable_diffusion/README.md as reference, I can confirm that
--precision="fp16" --device="vulkan" --no-use_tuned --import_mlir --vulkan_large_heap_block_size=0
will get you started using a Hugging Face model.
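For reference, a full invocation combining those flags might look like the following (the model id, prompt, and max_length are placeholders reused from earlier in this thread; adjust them for your own run):
python main.py --hf_model_id="andite/anything-v4.0" --precision="fp16" --device="vulkan" --no-use_tuned --import_mlir --vulkan_large_heap_block_size=0 --max_length=77 --prompt="tajmahal, oil on canvas, sunflowers, 4k, uhd"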
In addition to the above, putting the compvis stable-diffusion/main/configs/stable-diffusion/v1-inference.yaml file into shark/examples/shark_inference/stable-diffusion/ allows usage of local .ckpt files.
Edit: Just noticed that v1-inference.yaml seems to be tied to the working directory/where python is called from. I suppose the correct temp solution would be to place v1-inference.yaml at the root folder of /shark if you are following example commands.
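A minimal sketch of that temporary workaround from a PowerShell prompt (the raw.githubusercontent.com URL below is my assumption of where the CompVis config lives; run it from the folder you will launch python from, e.g. the SHARK root):
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/configs/stable-diffusion/v1-inference.yaml" -OutFile "v1-inference.yaml"  # saves the config into the current directory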
- The process, as you observed, is slow because of loading the checkpoint each time the program is being run - we'll also take a dig into speeding it up.
Looking forward to it!
With regards to the wget issue on Windows 11, part of the problem may be that "wget" defaults to an alias for "Invoke-WebRequest" in PowerShell. I had to install my own "wget.exe" binary and make sure it was in the PATH environment variable.
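If you just want to confirm which wget your shell resolves to before installing a separate binary, a quick check in PowerShell (a sketch; removing the alias only affects the current session) is:
Get-Command wget        # shows whether wget is the built-in alias for Invoke-WebRequest or a real wget.exe on PATH
Remove-Item Alias:wget  # drops the alias for this session so a wget.exe on PATH gets used instead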
That should work too. We will try to submit a patch upstream: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py#L863-L879
@rdfriese: The wget issue has been resolved as part of diffusers, but we have to wait for it to be rolled out in diffusers' next release.
@EvanGuanSF - the speeding up issue is currently being worked on and we can expect it to land in SHARK by today.
@MrManiacal - the main issue filed has been resolved so I'm closing this thread.
Feel free to open an issue in case you stumble upon any other bug.
Using the example .CKPT and command, I receive the following:
(shark.venv) PS D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion> python main.py --precision=fp16 --device=vulkan --prompt="tajmahal, oil on canvas, sunflowers, 4k, uhd" --max_length=64 --import_mlir --ckpt_loc="D:/nod/anything-v4.0/"
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['tajmahal, oil on canvas, sunflowers, 4k, uhd']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using vulkan://00000000-0900-0000-0000-000000000000 tuned models for stablediffusion/fp16.
D:\nod\shark-main\shark.venv\lib\site-packages\torch\jit\_check.py:181: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in __init__. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in torch.jit.Attribute.
warnings.warn("The TorchScript type system doesn't support "
loading existing vmfb from: D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\clip1_64_512_512_fp16_D_nod_anything_v4_0__vulkan-00000000-0900-0000-0000-000000000000.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Retrying with a different base model configuration
Retrying with a different base model configuration
Cannot compile the model. Please use enable_stack_trace and create an issue at https://github.com/nod-ai/SHARK/issues

When trying to use a HuggingFace model ID, I receive the following:
(shark.venv) PS D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion> python main.py --precision=fp16 --device=vulkan --prompt="tajmahal, oil on canvas, sunflowers, 4k, uhd" --max_length=64 --import_mlir --ckpt_loc="D:/nod/anything-v4.0/"
shark_tank local cache is located at C:\Users\M.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
Running StableDiffusion with the following config :-
Batch size : 1
Prompts : ['tajmahal, oil on canvas, sunflowers, 4k, uhd']
Runs : 1
Found device AMD Radeon RX 7900 XTX. Using target triple rdna3-7900-windows.
Using vulkan://00000000-0900-0000-0000-000000000000 tuned models for stablediffusion/fp16.
D:\nod\shark-main\shark.venv\lib\site-packages\torch\jit\_check.py:181: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in __init__. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in torch.jit.Attribute.
warnings.warn("The TorchScript type system doesn't support "
loading existing vmfb from: D:\Nod\SHARK-main\shark\examples\shark_inference\stable_diffusion\clip1_64_512_512_fp16_D_nod_anything_v4_0__vulkan-00000000-0900-0000-0000-000000000000.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Retrying with a different base model configuration
Retrying with a different base model configuration
Cannot compile the model. Please use enable_stack_trace and create an issue at https://github.com/nod-ai/SHARK/issues