nod-ai / SHARK

SHARK - High Performance Machine Learning Distribution
Apache License 2.0

539 fails to compile the model #1051

Open ride5k opened 1 year ago

ride5k commented 1 year ago

Brand new user here... day 1. Having some problems getting any output at all. I followed the instructions on GitHub and downloaded the latest AMD drivers (I have an RX 6700 XT); the install went fine. However, every attempt at just clicking "Generate Image" with default params, regardless of the model selected, fails with an error. For example:

Found device AMD Radeon RX 6700 XT. Using target triple rdna2-unknown-windows.
Using tuned models for prompthero/openjourney/fp16/vulkan://00000000-0a00-0000-0000-000000000000.
Downloading (…)cheduler_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 308/308 [00:00<00:00, 308kB/s]
loading existing vmfb from: C:\shark\euler_scale_model_input_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
loading existing vmfb from: C:\shark\euler_step_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Inferring base model configuration.
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Downloading (…)_model.safetensors";: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.44G/3.44G [10:00<00:00, 5.72MB/s]
Downloading (…)ain/unet/config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 743/743 [00:00<00:00, 743kB/s]
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Loading Winograd config file from  C:\Users\gilberts\.local/shark_tank/configs/unet_winograd_vulkan.json
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Traceback (most recent call last):
  File "gradio\routes.py", line 374, in run_predict
  File "gradio\blocks.py", line 1017, in process_api
  File "gradio\blocks.py", line 835, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "apps\stable_diffusion\scripts\txt2img.py", line 116, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 220, in from_pretrained
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 383, in __call__
SystemExit: Cannot compile the model. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues
Found device AMD Radeon RX 6700 XT. Using target triple rdna2-unknown-windows.
Using tuned models for wavymulder/Analog-Diffusion/fp16/vulkan://00000000-0a00-0000-0000-000000000000.
Downloading (…)cheduler_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 346/346 [00:00<00:00, 346kB/s]
loading existing vmfb from: C:\shark\euler_scale_model_input_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
loading existing vmfb from: C:\shark\euler_step_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Inferring base model configuration.
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Downloading (…)_pytorch_model.bin";: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.44G/3.44G [09:53<00:00, 5.80MB/s]
Downloading (…)ain/unet/config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 877/877 [00:00<00:00, 876kB/s]
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Loading Winograd config file from  C:\Users\gilberts\.local/shark_tank/configs/unet_winograd_vulkan.json
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Traceback (most recent call last):
  File "gradio\routes.py", line 374, in run_predict
  File "gradio\blocks.py", line 1017, in process_api
  File "gradio\blocks.py", line 835, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "apps\stable_diffusion\scripts\txt2img.py", line 116, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 220, in from_pretrained
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 383, in __call__
SystemExit: Cannot compile the model. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues
yzhang93 commented 1 year ago

Can you try with --clear_all flag?

c0mpylicious commented 1 year ago

I have the same issue on fresh attempt at installing.

`accelerate` is already installed via pip on my system, but it is not being used.

No directory structure is created apart from the models folder. Nothing is downloaded.

The "--clear_all" flag clears the files but the issue repeats again on that run. I tried custom checkpoints, including the sd-v1-4 model as a custom checkpoint. Constant failure.

I have no idea how to find the model IDs to try (why isn't there a drop-down box? Or why can't I just put them in a folder to use?).

yzhang93 commented 1 year ago

@c0mpylicious There is a drop-down box to select models. Can you try to select models from there? Can you try if "stabilityai/stable-diffusion-2-1-base" works?

and8928 commented 1 year ago

@c0mpylicious There is a drop-down box to select models. Can you try to select models from there? Can you try if "stabilityai/stable-diffusion-2-1-base" works?

It does not work with any model.

consolation1 commented 1 year ago

Have a look at the solution to my issue; you seem to be getting the same errors.

c0mpylicious commented 1 year ago

My comment about drop-down boxes was about having one for the Huggingface Model ID part of the form.

I did try a few of the drop-down models on the left. They all end up with the same error as above, although they did attempt to download different files; "model.safetensors" (3.46 GB) downloaded when attempting stable-diffusion-2-1-base, but it still led to the same errors. Using a pre-downloaded sd-v1-4.ckpt as a custom model also led to the same error.

Running `pip install accelerate` again inside the venv did at least remove that warning.
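The warning comes back whenever `accelerate` is missing from the interpreter that actually launches SHARK. As a quick sanity check (a hypothetical helper, not part of SHARK), you can verify the module is importable from the active venv before starting the UI:

```python
import importlib.util

def has_module(name: str) -> bool:
    """Return True if `name` is importable from the current interpreter."""
    return importlib.util.find_spec(name) is not None

# The `low_cpu_mem_usage` warning appears when this is False for the
# interpreter that runs SHARK, even if accelerate is installed elsewhere.
if not has_module("accelerate"):
    print("accelerate missing: run `pip install accelerate` in this venv")
```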

I will note that I am now running the git-cloned repository instead of the .exe, as the .exe didn't seem to do anything except launch the Gradio UI, which had nothing (no files) to work with.

I will attempt the solution by @consolation1 now.

Edit: Works! I tried a comparison prompt + settings against the previous ONNX setup I had on this Windows OS, and the pictures are nearly identical.

consolation1 commented 1 year ago

Credit goes to @yzhang93 for figuring it out.

nonemouse commented 1 year ago

I am still seeing this error on a Radeon 6600 when using a HuggingFace model id, even after adding --local_tank_cache=.

Using stabilityai/stable-diffusion-2-1 from the dropdown works:

Using C:\shark\cache as local shark_tank cache directory.
vulkan devices are available.
cuda devices are not available.
Running on local URL:  http://0.0.0.0:8080

To create a public link, set `share=True` in `launch()`.
Found device AMD Radeon RX 6600. Using target triple rdna2-unknown-windows.
Using tuned models for stabilityai/stable-diffusion-2-1/fp16/vulkan://00000000-0a00-0000-0000-000000000000.
torch\jit\_check.py:172: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in `__init__`. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in `torch.jit.Attribute`.
  warnings.warn("The TorchScript type system doesn't support "
loading existing vmfb from: C:\shark\euler_scale_model_input_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
loading existing vmfb from: C:\shark\euler_step_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Loaded vmfbs from cache and successfully fetched base model configuration.
50it [00:13,  3.67it/s]
100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.94it/s]

But trying to load hakurei/waifu-diffusion fails:

Using C:\shark\cache as local shark_tank cache directory.
vulkan devices are available.
cuda devices are not available.
Running on local URL:  http://0.0.0.0:8080

To create a public link, set `share=True` in `launch()`.
Found device AMD Radeon RX 6600. Using target triple rdna2-unknown-windows.
Using tuned models for hakurei/waifu-diffusion/fp16/vulkan://00000000-0a00-0000-0000-000000000000.
torch\jit\_check.py:172: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in `__init__`. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in `torch.jit.Attribute`.
  warnings.warn("The TorchScript type system doesn't support "
loading existing vmfb from: C:\shark\euler_scale_model_input_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
loading existing vmfb from: C:\shark\euler_step_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Inferring base model configuration.
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
torch\fx\node.py:250: UserWarning: Trying to prepend a node to itself. This behavior has no effect on the graph.
  warnings.warn("Trying to prepend a node to itself. This behavior has no effect on the graph.")
Loading Winograd config file from  C:\shark\cacheconfigs/unet_winograd_vulkan.json
100%|███████████████████████████████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 811B/s]
100%|█████████████████████████████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 1.10kB/s]
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Traceback (most recent call last):
  File "gradio\routes.py", line 374, in run_predict
  File "gradio\blocks.py", line 1017, in process_api
  File "gradio\blocks.py", line 835, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "apps\stable_diffusion\scripts\txt2img.py", line 116, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 220, in from_pretrained
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 383, in __call__
SystemExit: Cannot compile the model. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues

I can't find any more detailed log.

yzhang93 commented 1 year ago

@nonemouse This path C:\shark\cacheconfigs/unet_winograd_vulkan.json is not correct. Make sure you pass in --local_tank_cache=C:/shark/cache/
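The mangled `cacheconfigs` path suggests the config path is built by plain string concatenation, which is why the trailing slash matters. A minimal sketch of the failure mode (illustrative only, not SHARK's actual code):

```python
import posixpath

cache = "C:/shark/cache"  # passed without the trailing slash

# Naive concatenation silently drops the separator ...
bad = cache + "configs/unet_winograd_vulkan.json"
print(bad)   # C:/shark/cacheconfigs/unet_winograd_vulkan.json

# ... while joining (or keeping the trailing slash) produces the intended path.
good = posixpath.join(cache, "configs/unet_winograd_vulkan.json")
print(good)  # C:/shark/cache/configs/unet_winograd_vulkan.json
```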

nonemouse commented 1 year ago

@yzhang93 , thanks for the help. I tried your suggestion and it does not seem to make a difference beyond changing that path to C:/shark/cache/configs/unet_winograd_vulkan.json:

Using C:/shark/cache/ as local shark_tank cache directory.
vulkan devices are available.
cuda devices are not available.
Running on local URL:  http://0.0.0.0:8080

To create a public link, set `share=True` in `launch()`.
Found device AMD Radeon RX 6600. Using target triple rdna2-unknown-windows.
Using tuned models for hakurei/waifu-diffusion/fp16/vulkan://00000000-0a00-0000-0000-000000000000.
torch\jit\_check.py:172: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in `__init__`. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in `torch.jit.Attribute`.
  warnings.warn("The TorchScript type system doesn't support "
loading existing vmfb from: C:\shark\euler_scale_model_input_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
loading existing vmfb from: C:\shark\euler_step_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Inferring base model configuration.
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
torch\fx\node.py:250: UserWarning: Trying to prepend a node to itself. This behavior has no effect on the graph.
  warnings.warn("Trying to prepend a node to itself. This behavior has no effect on the graph.")
Loading Winograd config file from  C:/shark/cache/configs/unet_winograd_vulkan.json
100%|█████████████████████████████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 1.47kB/s]
100%|█████████████████████████████████████████████████████████████████████████████████| 107/107 [00:00<00:00, 1.55kB/s]
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
Retrying with a different base model configuration
Traceback (most recent call last):
  File "gradio\routes.py", line 374, in run_predict
  File "gradio\blocks.py", line 1017, in process_api
  File "gradio\blocks.py", line 835, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "apps\stable_diffusion\scripts\txt2img.py", line 116, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 220, in from_pretrained
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 383, in __call__
SystemExit: Cannot compile the model. Please create an issue with the detailed log at https://github.com/nod-ai/SHARK/issues
mrmak0030 commented 1 year ago

Just an FYI on the above: SHARK had been working for a while, but it failed after a clean install of the SHARK git directories and clearing all my .local cache folders (e.g. Huggingface, shark_tank, etc.). I then encountered the same issue, "SystemExit: Cannot compile the model" (using the nodai_shark git Python repository, not the .exe). From the above discussion I noticed that my "C://Users//@username//.local" no longer contained the directory "shark_tank". I manually created it and everything works now. Windows 11 Pro.
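The manual workaround above can be scripted. This sketch assumes the default shark_tank location under the user's `.local` directory (the helper name is hypothetical):

```python
from pathlib import Path

def ensure_shark_tank(base: Path) -> Path:
    """Create <base>/.local/shark_tank if it is missing; safe to re-run."""
    tank = base / ".local" / "shark_tank"
    tank.mkdir(parents=True, exist_ok=True)
    return tank

# On the affected machine this would be: ensure_shark_tank(Path.home())
```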

consolation1 commented 1 year ago

I installed SHARK on another computer today. I had to manually create the directory for the cache location I was passing to it; or, as mentioned above, manually creating the .local one worked without passing the cache-location variable.

nonemouse commented 1 year ago

I tried creating the shark_tank directory in the default location as well, and it did not help; I still got the same error when I used a HuggingFace model ID.

I also tried downloading a .ckpt file and putting it in the models directory and using that. This gave a different error:

Using C:/shark/cache/ as local shark_tank cache directory.
vulkan devices are available.
cuda devices are not available.
Running on local URL:  http://0.0.0.0:8080

To create a public link, set `share=True` in `launch()`.
Found device AMD Radeon RX 6600. Using target triple rdna2-unknown-windows.
Tuned models are currently not supported for this setting.
torch\jit\_check.py:172: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in `__init__`. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in `torch.jit.Attribute`.
  warnings.warn("The TorchScript type system doesn't support "
loading existing vmfb from: C:\shark\euler_scale_model_input_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
loading existing vmfb from: C:\shark\euler_step_1_512_512fp16.vmfb
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Diffusers' checkpoint will be identified here :  C:/shark/models/wd-1-4-anime_e2
Loading diffusers' pipeline from original stable diffusion checkpoint
global_step key not found in model
Downloading (…)_encoder/config.json: 100%|████████████████████████████████████████████████████| 633/633 [00:00<?, ?B/s]
huggingface_hub\file_download.py:129: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\Username\.cache\huggingface\hub. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Downloading (…)"model.safetensors";: 100%|████████████████████████████████████████| 1.36G/1.36G [02:08<00:00, 10.6MB/s]
transformers\modeling_utils.py:402: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
torch\_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
torch\storage.py:899: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  storage = cls(wrap_storage=untyped_storage)
safetensors\torch.py:98: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
Traceback (most recent call last):
  File "gradio\routes.py", line 374, in run_predict
  File "gradio\blocks.py", line 1017, in process_api
  File "gradio\blocks.py", line 835, in call_function
  File "anyio\to_thread.py", line 31, in run_sync
  File "anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 867, in run
  File "apps\stable_diffusion\scripts\txt2img.py", line 116, in txt2img_inf
  File "apps\stable_diffusion\src\pipelines\pipeline_shark_stable_diffusion_utils.py", line 220, in from_pretrained
  File "apps\stable_diffusion\src\models\model_wrappers.py", line 340, in __call__
  File "apps\stable_diffusion\src\utils\utils.py", line 427, in preprocessCKPT
  File "diffusers\pipelines\stable_diffusion\convert_from_ckpt.py", line 1097, in load_pipeline_from_original_stable_diffusion_ckpt
  File "diffusers\pipelines\stable_diffusion\convert_from_ckpt.py", line 789, in convert_open_clip_checkpoint
KeyError: 'cond_stage_model.model.text_projection'

I'm not sure if that is related or should be a different issue.

nonemouse commented 1 year ago

I tried again with the latest build, 568, which fixes the issue where --local_tank_cache needs to be provided, but otherwise with that build I am still experiencing the same behavior.

nonemouse commented 1 year ago

After some more research, I think the problem I am having, at least with the downloaded checkpoint, is this issue, which looks like it was recently fixed in diffusers.