Closed: lavadrop closed this issue 1 year ago
This error is a crash deep inside libamdhip64.so, which is part of the AMD ROCm libraries, so there is not much a higher-level application can do about it. I suggest reinstalling ROCm.
Also, make sure that your GPU is on the supported list for your specific version of ROCm, as different ROCm versions support different GPUs.
If needed, set the environment variable HSA_OVERRIDE_GFX_VERSION to the correct value for your GPU.
SD.Next tries to set it, but AMD is notoriously bad at detecting the capabilities of its own GPUs from inside ROCm.
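As a rough illustration of setting the override by hand, here is a minimal Python sketch. It assumes the commonly used convention that the override value is the major.minor.stepping form of a target architecture name (for example gfx1030 corresponds to 10.3.0). The helper names `gfx_to_override` and `build_env` are hypothetical and not part of SD.Next. Whether pretending to be gfx1100 actually works for a gfx1101 card is not confirmed anywhere in this thread.

```python
import os
import re


def gfx_to_override(gfx: str) -> str:
    """Convert an architecture name such as 'gfx1030' into '10.3.0'.

    Assumption: the last two characters are the minor and stepping
    digits (hexadecimal), everything before them is the major version.
    """
    match = re.fullmatch(r"gfx(\d+)([0-9a-f])([0-9a-f])", gfx)
    if match is None:
        raise ValueError(f"unrecognized architecture name: {gfx!r}")
    major, minor, stepping = match.groups()
    return f"{int(major)}.{int(minor, 16)}.{int(stepping, 16)}"


def build_env(target_gfx: str) -> dict:
    """Return a copy of the current environment with the override set."""
    env = dict(os.environ)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_to_override(target_gfx)
    return env


# Example: pretend to be a gfx1100 device (an assumption, not a verified fix).
env = build_env("gfx1100")
print(env["HSA_OVERRIDE_GFX_VERSION"])
# To launch with it, something like this could be used:
# subprocess.run(["./webui.sh", "--debug"], env=env)
```

The same effect can be had by exporting the variable in the shell before starting webui; the Python form just makes the name-to-version mapping explicit.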
You may get more information if you start webui with --debug; it will log a line like this:
log.debug(f'ROCm agent used by default: idx={idx} gpu={gpu} arch={arch}')
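If you want to pull those fields out of a saved debug log, a small sketch along these lines might help. It only assumes the line format shown above; the function name `parse_default_agent` is hypothetical and not an SD.Next API.

```python
import re
from typing import Optional

# Pattern derived from the log.debug format string quoted above.
AGENT_LINE = re.compile(
    r"ROCm agent used by default: idx=(?P<idx>\d+) gpu=(?P<gpu>\S+) arch=(?P<arch>\S+)"
)


def parse_default_agent(line: str) -> Optional[dict]:
    """Return idx/gpu/arch from a debug log line, or None if it does not match."""
    match = AGENT_LINE.search(line)
    if match is None:
        return None
    return {
        "idx": int(match.group("idx")),
        "gpu": match.group("gpu"),
        "arch": match.group("arch"),
    }


# Hypothetical sample line in the format above.
sample = "DEBUG ROCm agent used by default: idx=0 gpu=gfx1100 arch=navi3x"
print(parse_default_agent(sample))
```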
This is what I got from ./webui.sh --debug
17:31:57-130638 INFO Starting SD.Next
17:31:57-133540 INFO Python 3.10.12 on Linux
17:31:57-140238 INFO Version: app=sd.next updated=2023-09-20 hash=89ba8e3c url=https://github.com/vladmandic/automatic/tree/master
17:31:57-657869 INFO Platform: arch=x86_64 cpu=x86_64 system=Linux release=6.5.3-1-default python=3.10.12
17:31:57-659062 DEBUG Setting environment tuning
17:31:57-660042 DEBUG Torch overrides: cuda=False rocm=False ipex=False diml=False openvino=False
17:31:57-660925 DEBUG Torch allowed: cuda=True rocm=True ipex=True diml=True openvino=True
17:31:57-661885 INFO AMD ROCm toolkit detected
17:31:57-681716 DEBUG ROCm agents detected: ['gfx1101']
17:31:57-682488 DEBUG ROCm agent used by default: idx=0 gpu=gfx1101 arch=navi3x
17:31:57-722330 DEBUG ROCm version detected: 5.7
17:31:57-756009 DEBUG Repository update time: Wed Sep 20 06:39:56 2023
17:31:57-756831 DEBUG Previous setup time: Mon Sep 25 20:37:32 2023
17:31:57-757512 INFO Extensions: disabled=[]
17:31:57-758096 INFO Extensions: enabled=['LDSR', 'Lora', 'ScuNET', 'SwinIR', 'a1111-sd-webui-lycoris', 'clip-interrogator-ext', 'multidiffusion-upscaler-for-automatic1111',
'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg'] extensions-builtin
17:31:57-759447 INFO Extensions: enabled=[] extensions
17:31:57-760015 DEBUG Latest extensions time: Mon Sep 25 20:37:27 2023
17:31:57-760604 DEBUG Timestamps: version:1695213596 setup:1695695852 extension:1695695847
17:31:57-761209 INFO No changes detected: Quick launch active
17:31:57-761743 INFO Verifying requirements
17:31:57-771072 INFO Verifying packages
17:31:57-772465 INFO Extensions: disabled=[]
17:31:57-773033 INFO Extensions: enabled=['LDSR', 'Lora', 'ScuNET', 'SwinIR', 'a1111-sd-webui-lycoris', 'clip-interrogator-ext', 'multidiffusion-upscaler-for-automatic1111',
'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg'] extensions-builtin
17:31:57-774238 INFO Extensions: enabled=[] extensions
17:31:57-777279 INFO Extension preload: {'extensions-builtin': 0.0, 'extensions': 0.0}
17:31:57-778106 DEBUG Starting module: <module 'webui' from '/home/user/bin/vladSD/webui.py'>
17:31:57-778912 INFO Command line args: ['--debug'] debug=True
17:32:01-451931 DEBUG Loaded packages: torch=2.0.1+rocm5.4.2 diffusers=0.20.2 gradio=3.43.2
17:32:01-661394 DEBUG Reading: config.json len=14
17:32:01-662530 INFO Engine: backend=Backend.DIFFUSERS compute=rocm mode=no_grad device=cuda
17:32:01-663483 INFO Device: device=AMD Radeon Graphics n=1 hip=5.4.22803-474e8620
17:32:01-899500 DEBUG Entering start sequence
17:32:01-900580 DEBUG Initializing
17:32:01-901473 INFO Available VAEs: models/VAE items=0
17:32:01-902162 INFO Diffusers disabling uncompatible extensions: ['sd-webui-controlnet', 'multidiffusion-upscaler-for-automatic1111', 'a1111-sd-webui-lycoris']
17:32:01-902969 DEBUG Scanning diffusers cache: models/Diffusers models/Diffusers items=0 time=0.00s
17:32:01-903624 INFO Available models: models/Stable-diffusion items=0 time=0.00s
Download the default model? (y/N) n
17:32:04-382133 DEBUG Loading extensions
17:32:05-683755 INFO Extensions time: 1.30s { clip-interrogator-ext=0.39s Lora=0.13s sd-webui-agent-scheduler=0.22s stable-diffusion-webui-rembg=0.42s }
17:32:05-685696 DEBUG FS walk error: [Errno 2] No such file or directory: '/home/user/bin/vladSD/models/RealESRGAN' /home/user/bin/vladSD/models/RealESRGAN
17:32:05-686765 DEBUG Loaded upscalers: items=14
17:32:07-914723 INFO Loading UI theme: name=black-teal style=Auto
17:32:07-916130 DEBUG Loaded styles: folder=models/styles items=0
17:32:07-917759 DEBUG Creating UI
17:32:07-920207 DEBUG Reading: ui-config.json len=0
17:32:07-939492 DEBUG Extra networks: page='model' items=0 subdirs=1 tab=txt2img dirs=['models/Stable-diffusion', 'models/Diffusers', '/home/user/bin/vladSD/models/Stable-diffusion'] time=0.0
17:32:07-941337 DEBUG Extra networks: page='style' items=0 subdirs=0 tab=txt2img dirs=['models/styles'] time=0.0
17:32:07-942848 DEBUG Extra networks: page='embedding' items=0 subdirs=0 tab=txt2img dirs=['models/embeddings'] time=0.0
17:32:07-944569 DEBUG Extra networks: page='hypernetwork' items=0 subdirs=0 tab=txt2img dirs=['models/hypernetworks'] time=0.0
17:32:07-946445 DEBUG Extra networks: page='lora' items=0 subdirs=0 tab=txt2img dirs=['models/Lora'] time=0.0
17:32:08-056282 DEBUG Reading: ui-config.json len=0
17:32:08-080403 INFO Themes: builtin=6 default=5 external=54
17:32:08-319416 DEBUG Script: 0.18s ui_tabs /home/user/bin/vladSD/extensions-builtin/stable-diffusion-webui-images-browser/scripts/image_browser.py
17:32:08-321328 DEBUG Extensions list failed to load: /home/user/bin/vladSD/html/extensions.json
17:32:08-371907 DEBUG Extension list refresh: processed=12 installed=12 enabled=9 disabled=3 visible=12 hidden=0
17:32:08-697943 INFO Local URL: http://127.0.0.1:7860/
17:32:08-698829 DEBUG Gradio registered functions: 1442
17:32:08-699707 INFO Initializing middleware
17:32:08-702038 DEBUG Creating API
17:32:08-820313 INFO [AgentScheduler] Task queue is empty
17:32:08-821013 INFO [AgentScheduler] Registering APIs
17:32:08-891308 DEBUG Scripts setup: ['X/Y/Z Grid:0.005s']
17:32:08-892017 DEBUG Model metadata: metadata.json no changes
Segmentation fault (core dumped)
The latest rocm version is the one I installed and it explicitly lists compatibility with my GPU. From what I can tell from the log, rocm detects my GPU as gfx1101 which is correct (OpenGL renderer string: AMD Radeon Graphics (gfx1101, LLVM 16.0.6, DRM 3.54, 6.5.3-1-default))
One possible issue is that the installed ROCm drivers and the torch build that gets installed by default are too far apart.
There is no torch-rocm package available yet for ROCm 5.7, so matching of the torch-rocm version fails and the installer falls back to torch-rocm-5.4.2. Try forcing the latest torch-rocm-5.6 manually by setting the environment variable
TORCH_COMMAND="torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.6"
and either delete the venv or force a reinstall using the --reinstall flag.
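To make the mismatch concrete, the sketch below reads the two relevant debug lines (the detected ROCm version and the loaded torch build) and prepares an environment with the TORCH_COMMAND value quoted above. The helper names are hypothetical, the regular expressions assume the log format shown in this thread, and the launch command is left commented out because it would trigger a real reinstall.

```python
import os
import re

# Value quoted in this thread; not verified against current PyTorch indexes.
TORCH_COMMAND = (
    "torch torchvision torchaudio "
    "--index-url https://download.pytorch.org/whl/nightly/rocm5.6"
)


def rocm_versions(log_text: str) -> tuple:
    """Return (system ROCm major.minor, torch build ROCm major.minor) or None entries."""
    system = re.search(r"ROCm version detected: (\d+\.\d+)", log_text)
    torch_build = re.search(r"torch=\S+\+rocm(\d+\.\d+)", log_text)
    return (
        system.group(1) if system else None,
        torch_build.group(1) if torch_build else None,
    )


def reinstall_env() -> dict:
    """Copy the current environment and add the TORCH_COMMAND override."""
    env = dict(os.environ)
    env["TORCH_COMMAND"] = TORCH_COMMAND
    return env


# Two lines in the format of the debug log above.
log_text = (
    "DEBUG ROCm version detected: 5.7\n"
    "DEBUG Loaded packages: torch=2.0.1+rocm5.4.2 diffusers=0.20.2 gradio=3.43.2\n"
)
system, torch_build = rocm_versions(log_text)
if system != torch_build:
    print(f"mismatch: system ROCm {system}, torch built for ROCm {torch_build}")
    env = reinstall_env()
    # subprocess.run(["./webui.sh", "--reinstall"], env=env)
```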
Thanks. It appears torch-rocm 5.6 only added support for Radeon PRO W7900 and RX 7900 XTX which are Navi 31 gfx1100. I'd better wait.
Issue Description
I followed the instructions to configure the webui for SDXL and put the HuggingFace SD-XL files in the models directory. I restarted the server, which was stuck for a while. I pressed Enter in the terminal and it asked me whether I wanted to download the base model. I pressed n, then the server restarted and finally it wrote:
Segmentation fault (core dumped)
Version Platform Description
Version: 2023-09-20
Linux with rocm 5.7.0.50700-45~22.04
Firefox 117.0.1
no extensions
AMD Radeon RX 7800 XT
Relevant log output