[Issue]: DirectML with Simple preview not functioning

ThePixelDiffusionPirate commented 3 months ago

Issue Description

I switched from the original backend to the Diffusers backend last week. I have an AMD RX 5700 XT, so many features are unavailable to me anyway because of the hardware. I spent several days getting results of similar quality out of the Diffusers backend, but since the last update nothing works. I set up a completely fresh installation and made only a few minimal changes to the default settings. Now, when I generate with a simple test prompt, it freezes at 6%. There is no error message at all. After that I can't save anything either, for example in the settings.

Version Platform Description

[a1f53ad] SDNext Dev - Modern-UI
Attachments: clientlog.txt config.json serverlog.txt

Win11 23H2 (latest build), Firefox 127, AMD RX 5700 XT (latest driver)

Relevant log output

2024-06-17 08:20:27,518 | sd | INFO | launch | Starting SD.Next
2024-06-17 08:20:27,522 | sd | INFO | installer | Logger: file="C:\Users\TestMachine\AndreDiffusion\NextTest\automatic\sdnext.log" level=DEBUG size=65 mode=create
2024-06-17 08:20:27,524 | sd | INFO | installer | Python version=3.10.11 platform=Windows bin="C:\Users\TestMachine\AndreDiffusion\NextTest\automatic\venv\Scripts\python.exe" venv="C:\Users\TestMachine\AndreDiffusion\NextTest\automatic\venv"
2024-06-17 08:20:27,681 | sd | INFO | installer | Version: app=sd.next updated=2024-06-16 hash=a1f53add branch=dev url=https://github.com/vladmandic/automatic/tree/dev ui=dev
2024-06-17 08:20:28,029 | sd | INFO | installer | Updating main repository
2024-06-17 08:20:28,844 | sd | INFO | installer | Upgraded to version: a1f53add Sun Jun 16 17:00:35 2024 -0400
2024-06-17 08:20:28,850 | sd | INFO | launch | Platform: arch=AMD64 cpu=AMD64 Family 23 Model 113 Stepping 0, AuthenticAMD system=Windows release=Windows-10-10.0.22631-SP0 python=3.10.11
2024-06-17 08:20:28,852 | sd | DEBUG | installer | Setting environment tuning
2024-06-17 08:20:28,853 | sd | INFO | installer | HF cache folder: C:\Users\TestMachine\.cache\huggingface\hub
2024-06-17 08:20:28,854 | sd | DEBUG | installer | Torch allocator: "garbage_collection_threshold:0.80,max_split_size_mb:512"
2024-06-17 08:20:28,855 | sd | DEBUG | installer | Torch overrides: cuda=False rocm=False ipex=False diml=True openvino=False
2024-06-17 08:20:28,856 | sd | DEBUG | installer | Torch allowed: cuda=False rocm=False ipex=False diml=True openvino=False
2024-06-17 08:20:28,857 | sd | INFO | installer | Using DirectML Backend
2024-06-17 08:20:28,984 | sd | INFO | installer | Verifying requirements
2024-06-17 08:20:28,988 | sd | INFO | installer | Verifying packages
2024-06-17 08:20:28,990 | sd | INFO | launch | Startup: standard
2024-06-17 08:20:28,990 | sd | INFO | installer | Verifying submodules
2024-06-17 08:20:30,816 | sd | DEBUG | installer | Submodule: extensions-builtin/sd-extension-chainner / main
2024-06-17 08:20:31,432 | sd | DEBUG | installer | Submodule: extensions-builtin/sd-extension-system-info / main
2024-06-17 08:20:32,053 | sd | DEBUG | installer | Submodule: extensions-builtin/sd-webui-agent-scheduler / main
2024-06-17 08:20:32,692 | sd | DEBUG | installer | Submodule: extensions-builtin/sdnext-modernui / dev
2024-06-17 08:20:33,323 | sd | DEBUG | installer | Submodule: extensions-builtin/stable-diffusion-webui-rembg / master
2024-06-17 08:20:33,942 | sd | DEBUG | installer | Submodule: modules/k-diffusion / master
2024-06-17 08:20:34,537 | sd | DEBUG | installer | Submodule: wiki / master
2024-06-17 08:20:35,150 | sd | DEBUG | paths | Register paths
2024-06-17 08:20:35,219 | sd | DEBUG | installer | Installed packages: 186
2024-06-17 08:20:35,221 | sd | DEBUG | installer | Extensions all: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg']
2024-06-17 08:20:35,855 | sd | DEBUG | installer | Submodule: extensions-builtin\sd-extension-chainner / main
2024-06-17 08:20:36,524 | sd | DEBUG | installer | Submodule: extensions-builtin\sd-extension-system-info / main
2024-06-17 08:20:37,188 | sd | DEBUG | installer | Submodule: extensions-builtin\sd-webui-agent-scheduler / main
2024-06-17 08:20:37,775 | sd | DEBUG | installer | Running extension installer: C:\Users\TestMachine\AndreDiffusion\NextTest\automatic\extensions-builtin\sd-webui-agent-scheduler\install.py
2024-06-17 08:20:38,092 | sd | DEBUG | installer | Submodule: extensions-builtin\sdnext-modernui / dev
2024-06-17 08:20:38,741 | sd | DEBUG | installer | Submodule: extensions-builtin\stable-diffusion-webui-rembg / master
2024-06-17 08:20:39,336 | sd | DEBUG | installer | Running extension installer: C:\Users\TestMachine\AndreDiffusion\NextTest\automatic\extensions-builtin\stable-diffusion-webui-rembg\install.py
2024-06-17 08:20:39,620 | sd | DEBUG | installer | Extensions all: []
2024-06-17 08:20:39,621 | sd | INFO | installer | Extensions enabled: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg']
2024-06-17 08:20:39,623 | sd | INFO | installer | Verifying requirements
2024-06-17 08:20:39,624 | sd | INFO | installer | Updating Wiki
2024-06-17 08:20:39,652 | sd | DEBUG | installer | Submodule: C:\Users\TestMachine\AndreDiffusion\NextTest\automatic\wiki / master
2024-06-17 08:20:40,238 | sd | DEBUG | launch | Setup complete without errors: 1718605240
2024-06-17 08:20:40,242 | sd | DEBUG | installer | Extension preload: {'extensions-builtin': 0.0, 'extensions': 0.0}
2024-06-17 08:20:40,243 | sd | DEBUG | launch | Starting module: <module 'webui' from 'C:\\Users\\TestMachine\\AndreDiffusion\\NextTest\\automatic\\webui.py'>
2024-06-17 08:20:40,244 | sd | INFO | launch | Command line args: ['--debug', '--upgrade', '--use-directml', '--medvram'] medvram=True upgrade=True use_directml=True debug=True
2024-06-17 08:20:40,246 | sd | DEBUG | launch | Env flags: []
2024-06-17 08:20:44,936 | sd | INFO | loader | Load packages: {'torch': '2.3.1+cpu', 'diffusers': '0.29.0', 'gradio': '3.43.2'}
2024-06-17 08:20:45,497 | sd | DEBUG | shared | Read: file="config.json" json=60 bytes=2424 time=0.000
2024-06-17 08:20:45,499 | sd | DEBUG | shared | Unknown settings: ['queue_paused', 'queue_history_retention_days', 'queue_keyboard_shortcut']
2024-06-17 08:20:45,528 | sd | INFO | shared | Engine: backend=Backend.DIFFUSERS compute=directml device=privateuseone:0 attention="Dynamic Attention BMM" mode=no_grad
2024-06-17 08:20:45,581 | sd | INFO | shared | Device: device=AMD Radeon RX 5700 XT n=1 directml=0.2.2.dev240614
2024-06-17 08:20:45,582 | sd | DEBUG | shared | Read: file="html\reference.json" json=41 bytes=23838 time=0.000
2024-06-17 08:20:45,866 | sd | DEBUG | __init__ | ONNX: version=1.18.0 provider=DmlExecutionProvider, available=['AzureExecutionProvider', 'CPUExecutionProvider']
2024-06-17 08:20:45,948 | sd | DEBUG | sd_hijack | Importing LDM
2024-06-17 08:20:45,963 | sd | DEBUG | webui | Entering start sequence
2024-06-17 08:20:45,965 | sd | DEBUG | webui | Initializing
2024-06-17 08:20:45,977 | sd | INFO | sd_vae | Available VAEs: path="models\VAE" items=1
2024-06-17 08:20:45,978 | sd | DEBUG | sd_unet | Available UNets: path="models\UNET" items=0
2024-06-17 08:20:45,980 | sd | INFO | extensions | Disabled extensions: []
2024-06-17 08:20:45,981 | sd | DEBUG | shared | Read: file="cache.json" json=1 bytes=350 time=0.000
2024-06-17 08:20:45,983 | sd | DEBUG | shared | Read: file="metadata.json" json=2 bytes=3071 time=0.000
2024-06-17 08:20:45,985 | sd | DEBUG | modelloader | Scanning diffusers cache: folder=models\Diffusers items=0 time=0.00
2024-06-17 08:20:45,985 | sd | INFO | sd_models | Available models: path="models\Stable-diffusion" items=1 time=0.00
2024-06-17 08:20:46,179 | sd | DEBUG | webui | Load extensions
2024-06-17 08:20:46,218 | sd | INFO | networks | LoRA networks: available=0 folders=2
2024-06-17 08:20:46,221 | sd | INFO | script_loading | Extension: script='extensions-builtin\Lora\scripts\lora_script.py' 08:20:46-218274 INFO LoRA networks: available=0 folders=2
2024-06-17 08:20:46,677 | sd | INFO | script_loading | Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py' Using sqlite file: extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
2024-06-17 08:20:46,690 | sd | DEBUG | webui | Extensions init time: 0.51 sd-webui-agent-scheduler=0.43
2024-06-17 08:20:46,699 | sd | DEBUG | shared | Read: file="html/upscalers.json" json=4 bytes=2672 time=0.000
2024-06-17 08:20:46,701 | sd | DEBUG | shared | Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719 time=0.000
2024-06-17 08:20:46,702 | sd | DEBUG | chainner_model | chaiNNer models: path="models\chaiNNer" defined=24 discovered=0 downloaded=0
2024-06-17 08:20:46,705 | sd | DEBUG | modelloader | Load upscalers: total=52 downloaded=0 user=0 time=0.01 ['None', 'Lanczos', 'Nearest', 'ChaiNNer', 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
2024-06-17 08:20:46,707 | sd | DEBUG | styles | Load styles: folder="models\styles" items=0 time=0.00
2024-06-17 08:20:46,710 | sd | DEBUG | webui | Creating UI
2024-06-17 08:20:46,711 | sd | DEBUG | theme | UI themes available: type=Modern themes=31
2024-06-17 08:20:46,712 | sd | INFO | theme | UI theme: type=Modern name="Eoan-Moonlight"
2024-06-17 08:20:46,718 | sd | DEBUG | ui_javascript | UI theme: css="extensions-builtin\sdnext-modernui\themes\Eoan-Moonlight.css" base="base.css" user="None"
2024-06-17 08:20:46,721 | sd | DEBUG | ui_txt2img | UI initialize: txt2img
2024-06-17 08:20:46,743 | sd | DEBUG | ui_extra_networks | Extra networks: page='model' items=41 subfolders=2 tab=txt2img folders=['models\\Stable-diffusion', 'models\\Diffusers', 'models\\Reference'] list=0.01 thumb=0.00 desc=0.00 info=0.00 workers=4 sort=Default
2024-06-17 08:20:46,746 | sd | DEBUG | ui_extra_networks | Extra networks: page='lora' items=0 subfolders=0 tab=txt2img folders=['models\\Lora', 'models\\LyCORIS'] list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=4 sort=Default
2024-06-17 08:20:46,749 | sd | DEBUG | ui_extra_networks | Extra networks: page='style' items=0 subfolders=0 tab=txt2img folders=['models\\styles', 'html'] list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=4 sort=Default
2024-06-17 08:20:46,750 | sd | DEBUG | ui_extra_networks | Extra networks: page='embedding' items=0 subfolders=0 tab=txt2img folders=['models\\embeddings'] list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=4 sort=Default
2024-06-17 08:20:46,752 | sd | DEBUG | ui_extra_networks | Extra networks: page='vae' items=1 subfolders=0 tab=txt2img folders=['models\\VAE'] list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=4 sort=Default
2024-06-17 08:20:46,837 | sd | DEBUG | ui_img2img | UI initialize: img2img
2024-06-17 08:20:46,948 | sd | DEBUG | ui_control_helpers | UI initialize: control models=models\control
2024-06-17 08:20:47,555 | sd | DEBUG | shared | Read: file="ui-config.json" json=0 bytes=2 time=0.000
2024-06-17 08:20:47,668 | sd | DEBUG | theme | UI themes available: type=Modern themes=31
2024-06-17 08:20:48,265 | sd | DEBUG | ui_extensions | Extension list: processed=366 installed=6 enabled=6 disabled=0 visible=366 hidden=0
2024-06-17 08:20:48,550 | sd | DEBUG | webui | Root paths: ['C:\\Users\\TestMachine\\AndreDiffusion\\NextTest\\automatic']
2024-06-17 08:20:48,622 | sd | INFO | webui | Local URL: http://127.0.0.1:7860/
2024-06-17 08:20:48,623 | sd | DEBUG | webui | Gradio functions: registered=1737
2024-06-17 08:20:48,626 | sd | DEBUG | middleware | FastAPI middleware: ['Middleware', 'Middleware']
2024-06-17 08:20:48,629 | sd | DEBUG | webui | Creating API
2024-06-17 08:20:48,797 | sd | INFO | task_runner | [AgentScheduler] Runner is paused
2024-06-17 08:20:48,798 | sd | INFO | api | [AgentScheduler] Registering APIs
2024-06-17 08:20:48,906 | sd | DEBUG | webui | Scripts setup: ['IP Adapters:0.025', 'AnimateDiff:0.012', 'X/Y/Z Grid:0.012', 'Face:0.015', 'Image-to-Video:0.007', 'Stable Video Diffusion:0.006']
2024-06-17 08:20:48,907 | sd | DEBUG | sd_models | Model metadata: file="metadata.json" no changes
2024-06-17 08:20:48,913 | sd | INFO | devices | Torch override VAE dtype: no-half set
2024-06-17 08:20:48,915 | sd | DEBUG | devices | Desired Torch parameters: dtype=FP16 no-half=False no-half-vae=True upscast=False
2024-06-17 08:20:48,916 | sd | INFO | devices | Setting Torch parameters: device=privateuseone:0 dtype=torch.float16 vae=torch.float32 unet=torch.float16 context=no_grad fp16=True bf16=None optimization=Dynamic Attention BMM
2024-06-17 08:20:48,917 | sd | DEBUG | modeldata | Model requested: fn=<lambda>
2024-06-17 08:20:48,918 | sd | INFO | sd_models | Select: model="AYUBombastic [3928eee5b8]"
2024-06-17 08:20:48,920 | sd | DEBUG | sd_models | Load model: existing=False target=C:\Users\TestMachine\AndreDiffusion\NextTest\automatic\models\Stable-diffusion\AYUBombastic.safetensors info=None
2024-06-17 08:20:48,921 | sd | DEBUG | sd_models | Diffusers loading: path="C:\Users\TestMachine\AndreDiffusion\NextTest\automatic\models\Stable-diffusion\AYUBombastic.safetensors"
2024-06-17 08:20:48,922 | sd | INFO | sd_models | Autodetect: model="Stable Diffusion" class=StableDiffusionPipeline file="C:\Users\TestMachine\AndreDiffusion\NextTest\automatic\models\Stable-diffusion\AYUBombastic.safetensors" size=2040MB
2024-06-17 08:20:49,810 | sd | DEBUG | sd_models | Setting model: pipeline=StableDiffusionPipeline config={'low_cpu_mem_usage': True, 'torch_dtype': torch.float16, 'load_connected_pipeline': True, 'extract_ema': False, 'config': 'configs/sd15', 'use_safetensors': True, 'cache_dir': 'C:\\Users\\TestMachine\\.cache\\huggingface\\hub'}
2024-06-17 08:20:49,815 | sd | INFO | textual_inversion | Load embeddings: loaded=0 skipped=0 time=0.00
2024-06-17 08:20:49,884 | sd | DEBUG | sd_models | Setting model VAE: upcast=False
2024-06-17 08:20:49,885 | sd | DEBUG | sd_models | Setting model: enable VAE slicing
2024-06-17 08:20:49,886 | sd | DEBUG | sd_models | Setting model: enable VAE tiling
2024-06-17 08:20:49,901 | sd | DEBUG | sd_models | Setting model: enable model CPU offload
2024-06-17 08:20:50,118 | sd | DEBUG | devices | GC: collected=16 device=privateuseone:0 {'ram': {'used': 1.13, 'total': 31.93}, 'gpu': {'used': 0.55, 'total': 7.98}, 'retries': 0, 'oom': 0} time=0.19
2024-06-17 08:20:50,124 | sd | INFO | sd_models | Load model: time=1.01 load=0.89 move=0.11 native=512 {'ram': {'used': 1.13, 'total': 31.93}, 'gpu': {'used': 0.55, 'total': 7.98}, 'retries': 0, 'oom': 0}
2024-06-17 08:20:50,126 | sd | DEBUG | script_callbacks | Script callback init time: system-info.py:app_started=0.06 task_scheduler.py:app_started=0.12
2024-06-17 08:20:50,127 | sd | INFO | webui | Startup time: 9.88 torch=3.63 gradio=0.75 diffusers=0.31 libraries=1.01 extensions=0.51 face-restore=0.19 ui-en=0.13 ui-txt2img=0.07 ui-img2img=0.08 ui-control=0.29 ui-models=0.20 ui-settings=0.24 ui-extensions=0.49 ui-defaults=0.23 launch=0.12 api=0.10 app-started=0.18 checkpoint=1.22
2024-06-17 08:20:50,129 | sd | DEBUG | shared | Save: file="config.json" json=60 bytes=2345 time=0.003
2024-06-17 08:20:57,953 | sd | INFO | server | MOTD: N/A
2024-06-17 08:20:59,736 | sd | DEBUG | theme | UI themes available: type=Modern themes=31
2024-06-17 08:21:00,029 | sd | INFO | api | Browser session: user=None client=127.0.0.1 agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:127.0) Gecko/20100101 Firefox/127.0
2024-06-17 08:21:59,712 | sd | DEBUG | launch | Server: alive=True jobs=1 requests=268 uptime=74 memory=1.14/31.93 backend=Backend.DIFFUSERS state=idle
2024-06-17 08:23:22,207 | sd | DEBUG | run | Control process unit: i=1 process=None
2024-06-17 08:23:22,220 | sd | INFO | processing_diffusers | Base: class=StableDiffusionPipeline
2024-06-17 08:23:22,224 | sd | DEBUG | sd_samplers | Sampler: sampler="DPM++ 2M" config={'num_train_timesteps': 1000, 'beta_start': 0.00085, 'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon', 'thresholding': False, 'sample_max_value': 1.0, 'algorithm_type': 'dpmsolver++', 'solver_type': 'midpoint', 'lower_order_final': True, 'use_karras_sigmas': True, 'final_sigmas_type': 'zero', 'timestep_spacing': 'linspace', 'solver_order': 2}
2024-06-17 08:23:22,593 | sd | DEBUG | processing_helpers | Torch generator: device=cpu seeds=[2077700554]
2024-06-17 08:23:22,594 | sd | DEBUG | processing_args | Diffuser pipeline: StableDiffusionPipeline task=DiffusersTaskType.TEXT_2_IMAGE batch=1/1x1 set={'prompt_embeds': torch.Size([1, 77, 768]), 'negative_prompt_embeds': torch.Size([1, 77, 768]), 'guidance_scale': 6.4, 'num_inference_steps': 32, 'eta': 1.0, 'guidance_rescale': 0.7, 'output_type': 'latent', 'width': 512, 'height': 640, 'parser': 'Full parser'}
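
Because the hang produces no error, the last log entry before it is the main clue. The following is a minimal sketch for pulling those entries out of a saved log, assuming every line follows the `timestamp | logger | LEVEL | module | message` layout seen above. The helper names `parse_line` and `last_entries` are hypothetical and not part of SD.Next.

```python
from datetime import datetime


def parse_line(line):
    """Split one pipe-delimited log line into (timestamp, level, module, message).

    Returns None for lines that do not match the assumed layout.
    """
    parts = line.rstrip("\n").split(" | ", 4)
    if len(parts) != 5:
        return None
    try:
        # Timestamps look like "2024-06-17 08:23:22,594" (comma before milliseconds).
        stamp = datetime.strptime(parts[0], "%Y-%m-%d %H:%M:%S,%f")
    except ValueError:
        return None
    return stamp, parts[2], parts[3], parts[4]


def last_entries(lines, count=3):
    """Return the last `count` parseable entries, i.e. what ran just before a hang."""
    parsed = [entry for entry in map(parse_line, lines) if entry is not None]
    return parsed[-count:]
```

Run against the log above, the final entry comes from the `processing_args` module: the pipeline arguments were assembled, but nothing was logged after that point.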

Backend

Diffusers

Branch

Dev

Model

SD 1.5

Acknowledgements

[X] I have read the above and searched for existing issues
[X] I confirm that this is classified correctly and it's not an extension issue

ThePixelDiffusionPirate commented 3 months ago

OK, I've gone through everything. I sometimes got an error with Full precision for VAE (--no-half-vae) enabled, but somehow it no longer occurs. See attached logs. Then I tested Face Hires again; it only gives me errors, so I'm back on CodeFormer.

But the real blocker for me is "show_progress_type": "Simple". That has always worked until now, but as soon as I select it, everything freezes. "Freeze" is the wrong word: the UI is still usable and I can click, but changes are no longer saved and generation stops at 6%.
Attachments: FaceHiresError.txt NoHalfVAE.txt

vladmandic commented 3 months ago

root cause seems to be auto-casting issue with torch-directml backend

C:\Users\TestMachine\AndreDiffusion\NextTest\automatic\modules\dml\amp\autocast_mode.py:15 in forward
   14     if not torch.dml.is_autocast_enabled:
 ❱ 15         return op(*args, **kwargs)
   16     args = list(map(cast, args))
RuntimeError: Cannot set version_counter for inference tensor

cc @lshqqytiger

lshqqytiger commented 3 months ago

"Cannot set version_counter for inference tensor" is a torch-directml bug. Change the Torch inference mode setting to "none" and try again.

ThePixelDiffusionPirate commented 3 months ago

"none" or no-grad" with "simple" not working, hangs on 3%. "Approximate" with "no-grad" works.
