lshqqytiger / stable-diffusion-webui-amdgpu

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: Controlnet: Depth_Hand_Refiner not working #421

Open Kerorowong opened 3 months ago

Kerorowong commented 3 months ago

What happened?

I'm using Stable Diffusion web UI version 1.8.0-RC. When I try to use depth_hand_refiner in ControlNet, I get this error: RuntimeError: CUDA error: when calling cusparseXcoo2csr(handle, coorowind, i_nnz, i_m, csrrowptr, CUSPARSE_INDEX_BASE_ZERO)

Has anyone managed to get this working?

Steps to reproduce the problem

  1. Go to ControlNet.
  2. Upload an image with a problematic hand, e.g. one with 4 fingers.
  3. Choose Depth > Preprocessor: depth_hand_refiner > Model: control_v11f1p_sd15_depth.
  4. Click the explosion icon to run the preview.
  5. The preview fails with: RuntimeError: CUDA error: when calling cusparseXcoo2csr(handle, coorowind, i_nnz, i_m, csrrowptr, CUSPARSE_INDEX_BASE_ZERO)

What should have happened?

A depth map of a corrected hand should be displayed in the preview.

What browsers do you use to access the UI ?

Other

Sysinfo

https://1drv.ms/u/s!AvY15mBR3uvatDBxz2a1_R90y9x3?e=PjAhjM

Console logs

venv "F:\sd\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
ROCm Toolkit was found.
fatal: No names found, cannot describe anything.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: 1.8.0-RC
Commit hash: 25a3b6cbeea8a07afd5e4594afc2f1c79f41ac1a
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
  rank_zero_deprecation(
Launching Web UI with arguments:
ONNX: selected=CUDAExecutionProvider, available=['AzureExecutionProvider', 'CPUExecutionProvider']
Tag Autocomplete: Could not locate model-keyword extension, Lora trigger word completion will be limited to those added through the extra networks menu.
[-] ADetailer initialized. version: 24.3.1, num models: 10
ControlNet preprocessor location: F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\annotator\downloads
2024-03-20 00:44:54,179 - ControlNet - INFO - ControlNet v1.1.441
2024-03-20 00:44:54,401 - ControlNet - INFO - ControlNet v1.1.441
Loading weights [18ed2b6c48] from F:\sd\stable-diffusion-webui-directml\models\Stable-diffusion\xxmix9realistic_v40.safetensors
Creating model from config: F:\sd\stable-diffusion-webui-directml\configs\v1-inference.yaml
2024-03-20 00:44:58,547 - ControlNet - INFO - ControlNet UI callback registered.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 22.3s (prepare environment: 25.2s, initialize shared: 3.8s, load scripts: 4.1s, initialize extra networks: 1.4s, scripts before_ui_callback: 0.2s, create ui: 3.1s, gradio launch: 0.3s).
Loading VAE weights specified in settings: F:\sd\stable-diffusion-webui-directml\models\VAE\vae-ft-mse-840000-ema-pruned.safetensors
Applying attention optimization: Doggettx... done.
Model loaded in 28.6s (load weights from disk: 3.8s, create model: 1.1s, apply weights to model: 17.0s, load VAE: 5.6s, load textual inversion embeddings: 0.3s, calculate empty prompt: 0.5s).
2024-03-20 00:45:58,342 - ControlNet - INFO - Preview Resolution = 512
set os.environ[OMP_NUM_THREADS] to 4
F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\scipy\sparse\_index.py:102: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_intXint(row, col, x.flat[0])
=> loading pretrained model F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\annotator\downloads\hand_refiner\hr16/ControlNet-HandRefiner-pruned\hrnetv2_w64_imagenet_pretrained.pth
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Traceback (most recent call last):
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\controlnet_ui\controlnet_ui_group.py", line 1015, in run_annotator
    result, is_image = preprocessor(
  File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\utils.py", line 81, in decorated_func
    return cached_func(*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\utils.py", line 65, in cached_func
    return func(*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\global_state.py", line 37, in unified_preprocessor
    return preprocessor_modules[preprocessor_name](*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\processor.py", line 861, in run_model
    depth_map, mask, info = self.model(
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\hand_refiner\__init__.py", line 29, in __call__
    depth_map, mask, info = self.pipeline.get_depth(input_image, mask_bbox_padding)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\hand_refiner\pipeline.py", line 365, in get_depth
    cropped_depthmap, pred_2d_keypoints = self.run_inference(graphormer_input.astype(np.uint8), self._model, self.mano_model, self.mesh_sampler, scale, int(crop_len))
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\hand_refiner\pipeline.py", line 239, in run_inference
    pred_camera, pred_3d_joints, pred_vertices_sub, pred_vertices, hidden_states, att = Graphormer_model(batch_imgs, mano, mesh_sampler)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\bert\e2e_hand_network.py", line 37, in forward
    template_vertices_sub = mesh_sampler.downsample(template_vertices)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\_mano.py", line 141, in downsample
    y = spmm(self._D[j], y, self.device)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\util.py", line 26, in spmm
    return SparseMM.apply(sparse, dense)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\autograd\function.py", line 553, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\util.py", line 11, in forward
    return torch.matmul(sparse, dense)
RuntimeError: CUDA error:  when calling `cusparseXcoo2csr(handle, coorowind, i_nnz, i_m, csrrowptr, CUSPARSE_INDEX_BASE_ZERO)`

Additional information

Using AMD RX6600 XT GPU
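
For context, the cuSPARSE routine named in the error, cusparseXcoo2csr, compresses the sorted row indices of a COO sparse matrix into a CSR row-pointer array. The sketch below shows that conversion in plain Python for illustration only; the function name `coo_rows_to_csr_ptr` is hypothetical, and this is not the cuSPARSE or ZLUDA implementation.

```python
def coo_rows_to_csr_ptr(coo_row_indices, num_rows):
    """Compress COO row indices into a zero-based CSR row-pointer array.

    Illustrative only: mirrors what a coo2csr conversion computes,
    not how cuSPARSE or ZLUDA implement it.
    """
    # Count how many non-zero entries fall in each row.
    counts = [0] * num_rows
    for row in coo_row_indices:
        counts[row] += 1

    # A running sum of the counts gives the start offset of each row.
    row_ptr = [0] * (num_rows + 1)
    for i in range(num_rows):
        row_ptr[i + 1] = row_ptr[i] + counts[i]
    return row_ptr


# Four non-zeros in a 4-row matrix: two in row 0, one in row 1, one in row 3.
example_rows = [0, 0, 1, 3]
print(coo_rows_to_csr_ptr(example_rows, 4))  # [0, 2, 3, 3, 4]
```

Row `i` of the matrix then owns the entries from `row_ptr[i]` up to, but not including, `row_ptr[i + 1]`.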

lshqqytiger commented 3 months ago

You seem to be using ZLUDA, and it is hitting a function that is not implemented. I implemented it in v3.7-pre1. Download ZLUDA-windows-amd64.zip and unpack it into your ZLUDA folder, replacing all existing files. Then run

.\venv\Scripts\activate
pip uninstall torch -y

then try again.

Kerorowong commented 3 months ago

Microsoft Windows [Version 10.0.22631.3296]
(c) Microsoft Corporation. All rights reserved.

F:\sd\stable-diffusion-webui-directml>venv\Scripts\activate

(venv) F:\sd\stable-diffusion-webui-directml>pip uninstall torch -y
Found existing installation: torch 2.2.0+cu118
Uninstalling torch-2.2.0+cu118:
Successfully uninstalled torch-2.2.0+cu118

(venv) F:\sd\stable-diffusion-webui-directml>webui-user.bat
venv "F:\sd\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
ROCm Toolkit was found.
fatal: No names found, cannot describe anything.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: 1.8.0-RC
Commit hash: 25a3b6cbeea8a07afd5e4594afc2f1c79f41ac1a
Installing torch and torchvision
Looking in indexes: https://download.pytorch.org/whl/cu118
Collecting torch==2.2.0
Using cached https://download.pytorch.org/whl/cu118/torch-2.2.0%2Bcu118-cp310-cp310-win_amd64.whl (2704.3 MB)
Requirement already satisfied: torchvision==0.17.0 in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (0.17.0+cu118)
Requirement already satisfied: sympy in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.2.0) (1.12)
Requirement already satisfied: fsspec in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.2.0) (2024.2.0)
Requirement already satisfied: jinja2 in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.2.0) (3.1.2)
Requirement already satisfied: typing-extensions>=4.8.0 in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.2.0) (4.10.0)
Requirement already satisfied: filelock in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.2.0) (3.9.0)
Requirement already satisfied: networkx in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.2.0) (3.2.1)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from torchvision==0.17.0) (9.5.0)
Requirement already satisfied: numpy in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from torchvision==0.17.0) (1.26.2)
Requirement already satisfied: requests in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from torchvision==0.17.0) (2.28.1)
Requirement already satisfied: MarkupSafe>=2.0 in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from jinja2->torch==2.2.0) (2.1.3)
Requirement already satisfied: certifi>=2017.4.17 in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from requests->torchvision==0.17.0) (2022.12.7)
Requirement already satisfied: charset-normalizer<3,>=2 in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from requests->torchvision==0.17.0) (2.1.1)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from requests->torchvision==0.17.0) (1.26.13)
Requirement already satisfied: idna<4,>=2.5 in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from requests->torchvision==0.17.0) (3.4)
Requirement already satisfied: mpmath>=0.19 in f:\sd\stable-diffusion-webui-directml\venv\lib\site-packages (from sympy->torch==2.2.0) (1.3.0)
Installing collected packages: torch
Successfully installed torch-2.2.0+cu118
WARNING: There was an error checking the latest version of pip.
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
  rank_zero_deprecation(
Launching Web UI with arguments:
ONNX: selected=CUDAExecutionProvider, available=['AzureExecutionProvider', 'CPUExecutionProvider']
Tag Autocomplete: Could not locate model-keyword extension, Lora trigger word completion will be limited to those added through the extra networks menu.
[-] ADetailer initialized. version: 24.3.1, num models: 10
ControlNet preprocessor location: F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\annotator\downloads
2024-03-20 10:30:09,886 - ControlNet - INFO - ControlNet v1.1.441
2024-03-20 10:30:10,690 - ControlNet - INFO - ControlNet v1.1.441
Loading weights [18ed2b6c48] from F:\sd\stable-diffusion-webui-directml\models\Stable-diffusion\xxmix9realistic_v40.safetensors
Creating model from config: F:\sd\stable-diffusion-webui-directml\configs\v1-inference.yaml
2024-03-20 10:30:15,349 - ControlNet - INFO - ControlNet UI callback registered.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 475.5s (prepare environment: 303.4s, initialize shared: 190.6s, other imports: 0.2s, list SD models: 0.4s, load scripts: 5.9s, initialize extra networks: 1.1s, scripts before_ui_callback: 0.2s, create ui: 3.9s, gradio launch: 0.4s).
Loading VAE weights specified in settings: F:\sd\stable-diffusion-webui-directml\models\VAE\vae-ft-mse-840000-ema-pruned.safetensors
Applying attention optimization: Doggettx... done.
2024-03-20 10:35:54,871 - ControlNet - INFO - Preview Resolution = 512
set os.environ[OMP_NUM_THREADS] to 4
F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\scipy\sparse\_index.py:102: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_intXint(row, col, x.flat[0])
=> loading pretrained model F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\annotator\downloads\hand_refiner\hr16/ControlNet-HandRefiner-pruned\hrnetv2_w64_imagenet_pretrained.pth
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Model loaded in 666.6s (load weights from disk: 4.0s, create model: 1.1s, apply weights to model: 34.9s, load VAE: 5.5s, load textual inversion embeddings: 127.0s, calculate empty prompt: 493.9s).
2024-03-20 10:45:37,776 - ControlNet - INFO - Preview Resolution = 512
Traceback (most recent call last):
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\controlnet_ui\controlnet_ui_group.py", line 1015, in run_annotator
    result, is_image = preprocessor(
  File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\utils.py", line 81, in decorated_func
    return cached_func(*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\utils.py", line 65, in cached_func
    return func(*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\global_state.py", line 37, in unified_preprocessor
    return preprocessor_modules[preprocessor_name](*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\processor.py", line 861, in run_model
    depth_map, mask, info = self.model(
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\hand_refiner\__init__.py", line 29, in __call__
    depth_map, mask, info = self.pipeline.get_depth(input_image, mask_bbox_padding)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\hand_refiner\pipeline.py", line 365, in get_depth
    cropped_depthmap, pred_2d_keypoints = self.run_inference(graphormer_input.astype(np.uint8), self._model, self.mano_model, self.mesh_sampler, scale, int(crop_len))
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\hand_refiner\pipeline.py", line 239, in run_inference
    pred_camera, pred_3d_joints, pred_vertices_sub, pred_vertices, hidden_states, att = Graphormer_model(batch_imgs, mano, mesh_sampler)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\bert\e2e_hand_network.py", line 37, in forward
    template_vertices_sub = mesh_sampler.downsample(template_vertices)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\_mano.py", line 141, in downsample
    y = spmm(self._D[j], y, self.device)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\util.py", line 26, in spmm
    return SparseMM.apply(sparse, dense)
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\autograd\function.py", line 553, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\util.py", line 11, in forward
    return torch.matmul(sparse, dense)
RuntimeError: CUDA error:  when calling `cusparseCreateDnMat( &descB, kb, nb, ldb, b, cusparse_value_type, CUSPARSE_ORDER_COL )`

In the end, depth_hand_refiner still does not work, even though it takes a long time to process.

Kerorowong commented 3 months ago

Even normal image generation at 512 x 512 px now gets stuck and does not work.

lshqqytiger commented 3 months ago

How about v3.7-pre3?

Kerorowong commented 3 months ago

How about v3.7-pre3?

Image generation is still stuck without any response. Do image generation and the ControlNet depth_hand_refiner work on your side?

lshqqytiger commented 3 months ago

I don't use controlnet. How long did you wait? and does ctrl+c work?

Kerorowong commented 3 months ago

I don't use controlnet. How long did you wait? and does ctrl+c work?

The image finally showed up after 6 min. 15.2 sec. Now I'm testing ControlNet and am still waiting.

Kerorowong commented 3 months ago

Inpainting with the ControlNet depth_hand_refiner took 6 min. 41.3 sec., but no depth map was produced. I will continue to test it. Update: depth_hand_refiner does not work; no depth map is created, only an all-black image.

lshqqytiger commented 3 months ago

There was a bug. Try v3.7-pre4.

Kerorowong commented 3 months ago

Image generation: the first run took 29 min. 24.2 sec.; subsequent runs took 14.2 sec.

The ControlNet depth_hand_refiner still outputs only a black image.

venv "F:\sd\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
ROCm Toolkit was found.
fatal: No names found, cannot describe anything.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: 1.8.0-RC
Commit hash: 25a3b6cbeea8a07afd5e4594afc2f1c79f41ac1a
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
  rank_zero_deprecation(
Launching Web UI with arguments:
ONNX: selected=CUDAExecutionProvider, available=['AzureExecutionProvider', 'CPUExecutionProvider']
Tag Autocomplete: Could not locate model-keyword extension, Lora trigger word completion will be limited to those added through the extra networks menu.
[-] ADetailer initialized. version: 24.3.1, num models: 10
ControlNet preprocessor location: F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\annotator\downloads
2024-03-22 01:00:14,634 - ControlNet - INFO - ControlNet v1.1.441
2024-03-22 01:00:14,939 - ControlNet - INFO - ControlNet v1.1.441
Loading weights [84d76a0328] from F:\sd\stable-diffusion-webui-directml\models\Stable-diffusion\epicrealism_naturalSinRC1VAE.safetensors
Creating model from config: F:\sd\stable-diffusion-webui-directml\configs\v1-inference.yaml
2024-03-22 01:00:17,630 - ControlNet - INFO - ControlNet UI callback registered.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 215.0s (prepare environment: 38.3s, initialize shared: 192.8s, other imports: 0.1s, load scripts: 4.4s, initialize extra networks: 0.6s, create ui: 2.4s, gradio launch: 0.4s).
Loading VAE weights specified in settings: F:\sd\stable-diffusion-webui-directml\models\VAE\vae-ft-mse-840000-ema-pruned.safetensors
Applying attention optimization: Doggettx... done.
Model loaded in 658.7s (load weights from disk: 2.2s, create model: 1.0s, apply weights to model: 26.1s, load VAE: 6.2s, load textual inversion embeddings: 126.5s, calculate empty prompt: 496.6s).
100%|█████████████████████████████████████████████| 20/20 [09:43<00:00, 29.17s/it]
Total progress: 100%|█████████████████████████████| 20/20 [00:24<00:00, 1.23s/it]
100%|█████████████████████████████████████████████| 20/20 [00:13<00:00, 1.53it/s]
Total progress: 100%|█████████████████████████████| 20/20 [00:13<00:00, 1.49it/s]
Reusing loaded model epicrealism_naturalSinRC1VAE.safetensors [84d76a0328] to load xxmix9realistic_v40.safetensors [18ed2b6c48]
Loading weights [18ed2b6c48] from F:\sd\stable-diffusion-webui-directml\models\Stable-diffusion\xxmix9realistic_v40.safetensors
Loading VAE weights specified in settings: F:\sd\stable-diffusion-webui-directml\models\VAE\vae-ft-mse-840000-ema-pruned.safetensors
Applying attention optimization: Doggettx... done.
Weights loaded in 36.4s (send model to cpu: 0.9s, load weights from disk: 4.1s, apply weights to model: 30.6s, load VAE: 0.1s, move model to device: 0.6s).
100%|█████████████████████████████████████████████| 20/20 [00:13<00:00, 1.51it/s]
Total progress: 100%|█████████████████████████████| 20/20 [00:13<00:00, 1.49it/s]
2024-03-22 01:40:43,947 - ControlNet - INFO - Preview Resolution = 512, 1.53it/s]
set os.environ[OMP_NUM_THREADS] to 4
F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\scipy\sparse\_index.py:102: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_intXint(row, col, x.flat[0])
=> loading pretrained model F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\annotator\downloads\hand_refiner\hr16/ControlNet-HandRefiner-pruned\hrnetv2_w64_imagenet_pretrained.pth
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
2024-03-22 01:43:42,251 - ControlNet - INFO - Preview Resolution = 512
2024-03-22 01:43:43,254 - ControlNet - INFO - Preview Resolution = 512
2024-03-22 01:43:44,124 - ControlNet - INFO - Preview Resolution = 512
2024-03-22 01:43:44,847 - ControlNet - INFO - Preview Resolution = 512

Kerorowong commented 3 months ago

Inpainting with ControlNet Depth (preprocessor: depth_hand_refiner) took 6 min. 23.7 sec. Result: no changes, so it still doesn't work.

venv "F:\sd\stable-diffusion-webui-directml\venv\Scripts\Python.exe" ROCm Toolkit was found. fatal: No names found, cannot describe anything. Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)] Version: 1.8.0-RC Commit hash: 25a3b6cbeea8a07afd5e4594afc2f1c79f41ac1a no module 'xformers'. Processing without... no module 'xformers'. Processing without... No module 'xformers'. Proceeding without it. F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: pytorch_lightning.utilities.distributed.rank_zero_only has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from pytorch_lightning.utilities instead. rank_zero_deprecation( Launching Web UI with arguments: ONNX: selected=CUDAExecutionProvider, available=['AzureExecutionProvider', 'CPUExecutionProvider'] Tag Autocomplete: Could not locate model-keyword extension, Lora trigger word completion will be limited to those added through the extra networks menu. [-] ADetailer initialized. version: 24.3.1, num models: 10 ControlNet preprocessor location: F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\annotator\downloads 2024-03-22 01:46:34,177 - ControlNet - INFO - ControlNet v1.1.441 2024-03-22 01:46:34,318 - ControlNet - INFO - ControlNet v1.1.441 Loading weights [18ed2b6c48] from F:\sd\stable-diffusion-webui-directml\models\Stable-diffusion\xxmix9realistic_v40.safetensors 2024-03-22 01:46:34,699 - ControlNet - INFO - ControlNet UI callback registered. Creating model from config: F:\sd\stable-diffusion-webui-directml\configs\v1-inference.yaml Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch(). Startup time: 12.0s (prepare environment: 14.0s, initialize shared: 1.6s, load scripts: 3.4s, create ui: 0.8s, gradio launch: 0.4s). Loading VAE weights specified in settings: F:\sd\stable-diffusion-webui-directml\models\VAE\vae-ft-mse-840000-ema-pruned.safetensors Applying attention optimization: Doggettx... done. Model loaded in 5.2s (load weights from disk: 0.9s, create model: 0.4s, apply weights to model: 2.7s, load VAE: 0.2s, load textual inversion embeddings: 0.2s, calculate empty prompt: 0.5s). 100%|█████████████████████████████████████████████| 20/20 [00:13<00:00, 1.45it/s] Total progress: 100%|█████████████████████████████| 20/20 [00:13<00:00, 1.49it/s] 100%|█████████████████████████████████████████████| 20/20 [00:13<00:00, 1.53it/s] Total progress: 100%|█████████████████████████████| 20/20 [00:13<00:00, 1.49it/s] 2024-03-22 01:48:02,998 - ControlNet - INFO - unit_separate = False, style_align = False 2024-03-22 01:48:03,220 - ControlNet - INFO - Loading model: control_v11f1p_sd15_depth [cfd03158] 2024-03-22 01:48:16,070 - ControlNet - INFO - Loaded state_dict from [F:\sd\stable-diffusion-webui-directml\models\ControlNet\control_v11f1p_sd15_depth.pth] 2024-03-22 01:48:16,071 - ControlNet - INFO - controlnet_default_config 2024-03-22 01:48:18,948 - ControlNet - INFO - ControlNet model control_v11f1p_sd15_depth cfd03158 loaded. 2024-03-22 01:48:18,984 - ControlNet - INFO - Using preprocessor: depth_hand_refiner 2024-03-22 01:48:18,984 - ControlNet - INFO - preprocessor resolution = 512 set os.environ[OMP_NUM_THREADS] to 4 F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\scipy\sparse_index.py:102: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient. 
self._set_intXint(row, col, x.flat[0]) => loading pretrained model F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\annotator\downloads\hand_refiner\hr16/ControlNet-HandRefiner-pruned\hrnetv2_w64_imagenet_pretrained.pth INFO: Created TensorFlow Lite XNNPACK delegate for CPU. Error running process: F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\controlnet.py Traceback (most recent call last): File "F:\sd\stable-diffusion-webui-directml\modules\scripts.py", line 784, in process script.process(p, script_args) File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\controlnet.py", line 1279, in process self.controlnet_hack(p) File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\controlnet.py", line 1264, in controlnet_hack self.controlnet_main_entry(p) File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\controlnet.py", line 1029, in controlnet_main_entry controls, hr_controls = list(zip([preprocess_input_image(img) for img in optional_tqdm(input_images)])) File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\controlnet.py", line 1029, in controls, hr_controls = list(zip([preprocess_input_image(img) for img in optional_tqdm(input_images)])) File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\controlnet.py", line 986, in preprocess_input_image detected_map, is_image = self.preprocessor[unit.module]( File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\utils.py", line 81, in decorated_func return cached_func(*args, *kwargs) File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\utils.py", line 65, in cached_func return func(args, kwargs) File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\global_state.py", line 37, in unified_preprocessor return 
preprocessor_modules[preprocessor_name](*args, kwargs) File "F:\sd\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\processor.py", line 861, in run_model depth_map, mask, info = self.model( File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\hand_refiner__init.py", line 29, in call__ depth_map, mask, info = self.pipeline.get_depth(input_image, mask_bbox_padding) File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\hand_refiner\pipeline.py", line 365, in get_depth cropped_depthmap, pred_2d_keypoints = self.run_inference(graphormer_input.astype(np.uint8), self._model, self.mano_model, self.mesh_sampler, scale, int(crop_len)) File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\hand_refiner\pipeline.py", line 239, in run_inference pred_camera, pred_3d_joints, pred_vertices_sub, pred_vertices, hidden_states, att = Graphormer_model(batch_imgs, mano, mesh_sampler) File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl return self._call_impl(*args, *kwargs) File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl return forward_call(args, kwargs) File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\bert\e2e_hand_network.py", line 37, in forward template_vertices_sub = mesh_sampler.downsample(template_vertices) File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling_mano.py", line 141, in downsample y = spmm(self._D[j], y, self.device) File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\util.py", line 26, in spmm return SparseMM.apply(sparse, dense) File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\autograd\function.py", line 553, in apply return super().apply(*args, **kwargs) # type: ignore[misc] File 
"F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\mesh_graphormer\modeling\util.py", line 11, in forward return torch.matmul(sparse, dense) RuntimeError: CUDA error: when calling cusparseCreateDnMat( &descB, kb, nb, ldb, b, cusparse_value_type, CUSPARSE_ORDER_COL )


100%|█████████████████████████████████████████████| 16/16 [00:10<00:00, 1.52it/s]
Total progress: 100%|█████████████████████████████| 16/16 [00:10<00:00, 1.46it/s]
Total progress: 100%|█████████████████████████████| 16/16 [00:10<00:00, 1.53it/s]

Kerorowong commented 2 months ago

ZLUDA no longer works. What happened? I get this message: "ZLUDA device failed to pass basic operation test: index=None, device_name=AMD Radeon RX 6600 XT[ZLUDA]". The installation method I used previously no longer works either. 1.8.0-RC commit hash 25a3b6cbeea8a07afd5e4594afc2f1c79f41ac1a has become 1.7.0.

lshqqytiger commented 2 months ago

You will get an error while running these commands:

.\venv\Scripts\activate
python
import torch
ten1 = torch.randn((2, 4,), device="cuda")
ten2 = torch.randn((4, 8,), device="cuda")
torch.mm(ten1, ten2)

Let me know what it is.
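
The same check can be run as a single script that reports the failure instead of stopping at the first traceback. This is only a minimal sketch and not part of the webui: the function name `check_gpu_matmul` and the status labels it returns are made up for illustration. Run it with the Python interpreter of the environment that launches the webui.

```python
def check_gpu_matmul():
    """Run the small matrix multiply from the commands above.

    Returns a (status, detail) tuple. The status labels are illustrative
    and are not messages produced by the webui.
    """
    try:
        import torch
    except ImportError as exc:
        # torch is not installed for the interpreter running this script
        return "torch-missing", str(exc)

    try:
        ten1 = torch.randn((2, 4), device="cuda")
        ten2 = torch.randn((4, 8), device="cuda")
        result = torch.mm(ten1, ten2)
    except (AssertionError, RuntimeError) as exc:
        # AssertionError: "Torch not compiled with CUDA enabled" (CPU-only build)
        # RuntimeError: the device or driver layer rejected the operation
        return "gpu-failed", f"{type(exc).__name__}: {exc}"

    return "ok", f"shape={tuple(result.shape)}, torch={torch.__version__}"


if __name__ == "__main__":
    status, detail = check_gpu_matmul()
    print(status, detail)
```

A "torch-missing" result suggests the script ran in a different environment from the one the webui uses; "gpu-failed" prints the exception text that is worth posting back in the thread.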

Kerorowong commented 2 months ago

C:\Users\Intel>conda activate test

(test) C:\Users\Intel>python
Python 3.10.6 | packaged by conda-forge | (main, Oct 24 2022, 16:02:16) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'torch'
>>> ten1 = torch.randn((2, 4,), device="cuda")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'torch' is not defined
>>> ten2 = torch.randn((4, 8,), device="cuda")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'torch' is not defined
>>> torch.mm(ten1, ten2)

lshqqytiger commented 2 months ago

Did you run it using the same virtual environment with webui?
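
One way to answer this is to print which interpreter is actually running and which environment it belongs to. A minimal sketch using only the standard library; the helper name `describe_interpreter` is hypothetical, and it assumes the usual activation scripts, which set `VIRTUAL_ENV` (venv) and `CONDA_DEFAULT_ENV` (conda).

```python
import os
import sys


def describe_interpreter():
    """Return where the running Python lives and which environment it belongs to."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # sys.base_prefix differs from sys.prefix inside a venv
        "in_venv": sys.prefix != sys.base_prefix,
        # Set by the venv and conda activation scripts, if one was run
        "VIRTUAL_ENV": os.environ.get("VIRTUAL_ENV"),
        "CONDA_DEFAULT_ENV": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If `executable` does not point into the webui's `venv\Scripts` folder (or the conda environment used to launch it), the test was run against a different set of installed packages.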

Kerorowong commented 2 months ago

It stopped working after I ran git pull, so I had to reinstall it. However, it keeps giving me the "ZLUDA device failed to pass basic operation test: index=None, device_name=AMD Radeon RX 6600 XT[ZLUDA]" error. This venv is the one I installed successfully with version 1.6.1 using DirectML.

Yes, I used miniconda to create the virtual environment for version 1.6.1:

conda create --name test python=3.10.6
conda activate test

To test your code, I ran:

conda activate test
python
import torch
ten1 = torch.randn((2, 4,), device="cuda")
ten2 = torch.randn((4, 8,), device="cuda")
torch.mm(ten1, ten2)

and the result is what you saw above.

Kerorowong commented 2 months ago

I redid the test using conda and the venv from version 1.6.1. I hope this is the info you need.

Microsoft Windows [Version 10.0.22631.3447]
(c) Microsoft Corporation. All rights reserved.

F:\sd\stable-diffusion-webui-directml>conda activate test

(test) F:\sd\stable-diffusion-webui-directml>venv\Scripts\activate

(venv) (test) F:\sd\stable-diffusion-webui-directml>python
Python 3.10.6 | packaged by conda-forge | (main, Oct 24 2022, 16:02:16) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> ten1 = torch.randn((2, 4,), device="cuda")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
>>> ten2 = torch.randn((4, 8,), device="cuda")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "F:\sd\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
>>> torch.mm(ten1, ten2)

lshqqytiger commented 2 months ago

Don't activate two or more virtual environments in the same terminal.
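
A quick way to see whether more than one environment is activated in the current terminal is to look at the variables the activation scripts set. A minimal sketch, assuming the standard behavior where venv's activate script sets `VIRTUAL_ENV` and conda sets `CONDA_DEFAULT_ENV`; the helper name `active_environments` is hypothetical.

```python
import os


def active_environments(environ=os.environ):
    """List the environment managers that look activated in this terminal."""
    found = []
    venv_path = environ.get("VIRTUAL_ENV")
    if venv_path:
        found.append(("venv", venv_path))
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        found.append(("conda", conda_env))
    return found


if __name__ == "__main__":
    envs = active_environments()
    for kind, name in envs:
        print(f"{kind}: {name}")
    if len(envs) > 1:
        print("More than one environment is active; open a fresh terminal and activate only one.")
```

With a prompt like `(venv) (test)`, this would list both a venv and a conda entry, which is the situation to avoid.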

If you want to use conda,

  1. Open anaconda terminal with base environment activated.
  2. Create environment. (conda create ...)
  3. Activate environment. (conda activate ...)
  4. Launch webui. (python launch.py --use-zluda ...)
  5. If "ZLUDA device failed to pass basic operation test" is shown, run
    python
    import torch
    ten1 = torch.randn((2, 4,), device="cuda")
    ten2 = torch.randn((4, 8,), device="cuda")
    torch.mm(ten1, ten2)

Kerorowong commented 2 months ago

This is what I get

F:\sd\sd_zluda>conda activate tiger

(tiger) F:\sd\sd_zluda>v1.9.0-1-ge51a8494
'v1.9.0-1-ge51a8494' is not recognized as an internal or external command, operable program or batch file.

(tiger) F:\sd\sd_zluda>python launch.py --use-zluda
WARNING: ZLUDA works best with SD.Next. Please consider migrating to SD.Next.
Using ZLUDA in D:\PATH\zluda
Python 3.10.6 | packaged by conda-forge | (main, Oct 24 2022, 16:02:16) [MSC v.1916 64 bit (AMD64)]
Version: v1.9.0-1-ge51a8494
Commit hash: e51a8494691acfbfa18d1989fa4b810a15bf10b1
Traceback (most recent call last):
  File "F:\sd\sd_zluda\launch.py", line 48, in <module>
    main()
  File "F:\sd\sd_zluda\launch.py", line 39, in main
    prepare_environment()
  File "F:\sd\sd_zluda\modules\launch_utils.py", line 593, in prepare_environment
    raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

(tiger) F:\sd\sd_zluda>python
Python 3.10.6 | packaged by conda-forge | (main, Oct 24 2022, 16:02:16) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> ten1 = torch.randn((2, 4,), device="cuda")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Programs\miniconda3\envs\tiger\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
>>> ten2 = torch.randn((4, 8,), device="cuda")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Programs\miniconda3\envs\tiger\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
>>> torch.mm(ten1, ten2)

lshqqytiger commented 2 months ago

There should be torch 2.2.2+cu118. Try this:

pip uninstall torch torchvision -y
python launch.py --use-zluda

Kerorowong commented 2 months ago

It works now, but I don't understand why it didn't work previously. Is it because of torch 2.2.2+cu118?

lshqqytiger commented 2 months ago

You previously had a CPU-only build of torch installed.
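
The installed build can usually be told apart from its version string: official wheels carry a local label after the `+`, such as `2.2.2+cu118` or `2.2.2+cpu`. A minimal sketch, assuming that labeling convention; the helper name `classify_torch_build` is hypothetical, and a version with no label is reported as "unknown" rather than guessed.

```python
def classify_torch_build(version):
    """Guess the build flavor from a torch version string such as '2.2.2+cu118'."""
    # The part after '+' is the local version label used by the official wheels.
    _, _, local = version.partition("+")
    if local.startswith("cu"):
        return "cuda"
    if local.startswith("cpu"):
        return "cpu"
    if local.startswith("rocm"):
        return "rocm"
    return "unknown"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        # torch.version.cuda is None when the build has no CUDA support
        print(torch.__version__, classify_torch_build(torch.__version__), torch.version.cuda)
```

For the ZLUDA setup discussed here, the expected result is "cuda"; "cpu" matches the "Torch not compiled with CUDA enabled" error shown earlier in the thread.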

Kerorowong commented 2 months ago

ok, thanks a lot!