TEM-112 closed 2 months ago
HIP SDK has a bug with RX 500 cards (pre-Navi).
Also, HIP SDK won't work if you have an older CPU; using DirectML instead should work.
OK, but how do I install and use DirectML for Stable Diffusion?
Can I solve this somehow, e.g. by using a different HIP SDK version, or is this a fundamental problem? And are there other ways to get SD Forge to work with my RX 570?
Replace --use-zluda with --directml.
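As a minimal sketch of that swap, assuming the launch flags are kept as one space-separated string (for example in a COMMANDLINE_ARGS setting, which is an assumption about the setup and not stated in this thread), the replacement could be done like this. The helper name swap_backend_flag is hypothetical; only the two flag names come from the reply above.

```python
def swap_backend_flag(args: str,
                      old: str = "--use-zluda",
                      new: str = "--directml") -> str:
    """Return the argument string with `old` replaced by `new`.

    Hypothetical helper for illustration; it only edits a string and
    does not touch any launcher files.
    """
    tokens = args.split()
    # Replace every occurrence of the old backend flag.
    swapped = [new if tok == old else tok for tok in tokens]
    # If the old flag was absent, make sure the new one is still present.
    if new not in swapped:
        swapped.append(new)
    return " ".join(swapped)


print(swap_backend_flag("--use-zluda --skip-version-check"))
```

Other flags in the string are left untouched, so only the backend selection changes.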
it works, thanks
I just installed SD Forge, and when I try to generate something, it just says: TypeError: 'NoneType' object is not iterable
venv "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\venv\Scripts\Python.exe"
ROCm Toolkit 5.7 was found.
fatal: No names found, cannot describe anything.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: f2.0.1v1.10.1-1.10.1
Commit hash: 976a7bf0a135f325a192af8374117bbe8d99de1b
Using ZLUDA in D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge.zluda
Total VRAM 8192 MB, total RAM 32695 MB
pytorch version: 2.3.0+cu118
Set vram state to: NORMAL_VRAM
Device: cuda:0 Radeon RX 570 Series [ZLUDA] : native
VAE dtype preferences: [torch.bfloat16, torch.float32] -> torch.bfloat16
Launching Web UI with arguments:
CUDA Using Stream: False
Using pytorch cross attention
Using pytorch attention for VAE
D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\diffusers\models\vq_model.py:20: FutureWarning: VQEncoderOutput is deprecated and will be removed in version 0.31. Importing VQEncoderOutput from diffusers.models.vq_model is deprecated and this will be removed in a future version. Please use from diffusers.models.autoencoders.vq_model import VQEncoderOutput, instead.
  deprecate("VQEncoderOutput", "0.31", deprecation_message)
D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\diffusers\models\vq_model.py:25: FutureWarning: VQModel is deprecated and will be removed in version 0.31. Importing VQModel from diffusers.models.vq_model is deprecated and this will be removed in a future version. Please use from diffusers.models.autoencoders.vq_model import VQModel, instead.
  deprecate("VQModel", "0.31", deprecation_message)
ONNX: version=1.18.1 provider=CPUExecutionProvider, available=['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
ZLUDA device failed to pass basic operation test: index=0, device_name=Radeon RX 570 Series [ZLUDA]
CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
==============================================================================
You are running torch 2.3.0+cu118.
The program is tested to work with torch 2.3.1.
To reinstall the desired version, run with commandline flag --reinstall-torch.
Beware that this will cause a lot of large files to be downloaded, as well as there are reports of issues with training tab on the latest version.
Use --skip-version-check commandline argument to disable this check.
ControlNet preprocessor location: D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\models\ControlNetPreprocessor
2024-08-14 04:06:28,668 - ControlNet - INFO - ControlNet UI callback registered.
Model selected: {'checkpoint_info': {'filename': 'D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\models\Stable-diffusion\realisticVisionV51_v51VAE.safetensors', 'hash': 'a0f13c83'}, 'vae_filename': None, 'unet_storage_dtype': None}
Running on local URL: http://127.0.0.1:7860
To create a public link, set share=True in launch().
Startup time: 26.8s (prepare environment: 7.9s, launcher: 3.3s, import torch: 4.0s, initialize shared: 6.6s, load scripts: 3.1s, create ui: 3.2s, gradio launch: 1.8s).
Environment vars changed: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}
Exception in thread MemMon:
Traceback (most recent call last):
  File "C:\Users\TS\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules\memmon.py", line 43, in run
    torch.cuda.reset_peak_memory_stats()
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\cuda\memory.py", line 309, in reset_peak_memory_stats
    return torch._C._cuda_resetPeakMemoryStats(device)
RuntimeError: invalid argument to reset_peak_memory_stats
Loading Model: {'checkpoint_info': {'filename': 'D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\models\Stable-diffusion\realisticVisionV51_v51VAE.safetensors', 'hash': 'a0f13c83'}, 'vae_filename': None, 'unet_storage_dtype': None}
StateDict Keys: {'unet': 686, 'vae': 248, 'text_encoder': 197, 'ignore': 0}
D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float16, 'computation_dtype': torch.float16}
Model loaded in 1.0s (unload existing model: 0.2s, forge model load: 0.8s).
To load target model ModuleDict
Begin to load 1 model
[Memory Management] Current Free GPU Memory: 7331.17 MB
[Memory Management] Required Model Memory: 234.72 MB
[Memory Management] Required Inference Memory: 1024.00 MB
[Memory Management] Estimated Remaining GPU Memory: 6072.45 MB
Moving model(s) has taken 0.15 seconds
Traceback (most recent call last):
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules_forge\main_thread.py", line 37, in loop
    task.work()
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules_forge\main_thread.py", line 26, in work
    self.result = self.func(*self.args, **self.kwargs)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules\txt2img.py", line 110, in txt2img_function
    processed = processing.process_images(p)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules\processing.py", line 800, in process_images
    res = process_images_inner(p)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules\processing.py", line 1005, in process_images_inner
    p.setup_conds()
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules\processing.py", line 1590, in setup_conds
    super().setup_conds()
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules\processing.py", line 493, in setup_conds
    self.uc = self.get_conds_with_caching(prompt_parser.get_learned_conditioning, negative_prompts, total_steps, [self.cached_uc], self.extra_network_data)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules\processing.py", line 464, in get_conds_with_caching
    cache[1] = function(shared.sd_model, required_prompts, steps, hires_steps, shared.opts.use_old_scheduling)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules\prompt_parser.py", line 189, in get_learned_conditioning
    conds = model.get_learned_conditioning(texts)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\backend\diffusion_engine\sd15.py", line 63, in get_learned_conditioning
    cond = self.text_processing_engine(prompt)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\backend\text_processing\classic_engine.py", line 268, in __call__
    z = self.process_tokens(tokens, multipliers)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\backend\text_processing\classic_engine.py", line 301, in process_tokens
    z = self.encode_with_transformers(tokens)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\backend\text_processing\classic_engine.py", line 126, in encode_with_transformers
    self.text_encoder.transformer.text_model.embeddings.position_embedding = self.text_encoder.transformer.text_model.embeddings.position_embedding.to(dtype=torch.float32)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1173, in to
    return self._apply(convert)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 804, in _apply
    param_applied = fn(param)
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1159, in convert
    return t.to(
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
Error completing request
Arguments: ('task(jpqpqwo13p9t0b8)', <gradio.route_utils.Request object at 0x000002254C2AC2B0>, 'blabla', 'nanan', ['photo-alien'], 1, 1, 5, 3.5, 1152, 896, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', None, 0, 20, 'DPM++ 2M SDE', 'Karras', False, -1, False, -1, 0, 0, 0, ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=None, batch_mask_gallery=None, generated_image=None, mask_image=None, mask_image_fg=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, image_fg=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=None, batch_mask_gallery=None, generated_image=None, mask_image=None, mask_image_fg=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, image_fg=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=None, batch_mask_gallery=None, generated_image=None, mask_image=None, mask_image_fg=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, image_fg=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), False, 7, 1, 'Constant', 0, 'Constant', 0, 1, 'enable', 'MEAN', 'AD', 1, False, 1.01, 1.02, 0.99, 0.95, False, 0.5, 2, False, 3, False, 3, 2, 0, 0.35, True, 'bicubic', 'bicubic', False, 0, 'anisotropic', 0, 'reinhard', 100, 0, 'subtract', 0, 0, 'gaussian', 'add', 0, 100, 127, 0, 'hard_clamp', 5, 0, 'None', 'None', False, 'MultiDiffusion', 768, 768, 64, 4, False, False, False, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', '', 0, '', '', 0, '', '', True, False, False, False, False, False, False, 0, False) {}
Traceback (most recent call last):
  File "D:\MAmes\Tools\Stabile Defusion\stable-diffusion-webui-amdgpu-forge\modules\call_queue.py", line 74, in f
    res = list(func(*args, **kwargs))
TypeError: 'NoneType' object is not iterable
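The TypeError reported by the UI appears to be a follow-on symptom: the log shows the CUDA out-of-memory RuntimeError first, and only afterwards does list(...) in call_queue.py fail. A minimal, self-contained sketch of that pattern follows. The Task class, generate and run_and_collect are hypothetical stand-ins, and the assumption that the worker prints the exception and leaves its result as None is inferred from the log, not confirmed from the Forge source.

```python
import traceback


class Task:
    """Hypothetical stand-in for a queued generation job."""

    def __init__(self, func):
        self.func = func
        self.result = None

    def work(self):
        try:
            self.result = self.func()
        except Exception:
            # The real cause is printed here; result stays None.
            traceback.print_exc()


def generate():
    # Stand-in for the step that failed on the GPU.
    raise RuntimeError("CUDA error: out of memory")


def run_and_collect(task):
    task.work()
    # Mirrors `res = list(func(*args, **kwargs))` from the log:
    # iterating over None raises the TypeError shown in the UI.
    return list(task.result)


task = Task(generate)
masked = None
try:
    run_and_collect(task)
except TypeError as exc:
    masked = str(exc)

print(masked)
```

Read this way, the line to act on is the earlier "CUDA error: out of memory" (along with the failed ZLUDA device test at startup), not the final TypeError.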