vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0

[Issue]: NameError: name 'amdsmi' is not defined #3181

Closed WrongFerret closed 4 months ago

WrongFerret commented 4 months ago

Issue Description

Hello, I just updated to the latest release of SD.Next and upgraded to ROCm 6.1.1, since I saw that the newest version supports 6.1. I am using Linux Mint 21.3 with an AMD RX 7800 XT. SD.Next now fails to launch because it does not detect amdsmi. I've pasted the output of ./webui.sh --debug below. I've found a couple of people using Automatic1111 and ComfyUI who report similar issues. I've recompiled amd_smi (steps below) and tried a fresh install of SD.Next, but I am still getting this error. Any help would be greatly appreciated.

Recompile steps I tried:

sudo apt install amd-smi-lib
cd /opt/rocm/share/amd_smi
python3 -m pip install --upgrade pip
python3 -m pip install --user .
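
One thing worth checking after these steps is whether the interpreter that SD.Next actually runs (the one inside venv) can find the package: a --user install lands in the user site-packages, which a venv does not see by default. Below is a minimal sketch of such a check. The helper name optional_module_status is hypothetical, and the module names are simply the ones mentioned in this thread.

```python
import importlib.util
import sys


def optional_module_status(names):
    """Map each top-level module name to True if this interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # sys.prefix differs from sys.base_prefix when running inside a venv.
    print("inside venv:", sys.prefix != sys.base_prefix)
    status = optional_module_status(["amdsmi", "pynvml", "torch"])
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Running this once with the system python3 and once after source venv/bin/activate should show whether the two environments disagree about amdsmi.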

Version Platform Description

12:40:54-389459 INFO     Version: app=sd.next updated=2024-05-29 hash=042cac88  
                         branch=master                                          
                         url=https://github.com/vladmandic/automatic/tree/master
12:40:54-836097 INFO     Latest published version:                              
                         d19fc8c7ab9751f11cf1b648818e095ee4a2c2e9               
                         2024-05-29T15:59:32Z                                   
12:40:54-839715 INFO     Platform: arch=x86_64 cpu=x86_64 system=Linux          
                         release=6.5.0-35-generic python=3.10.12 

inxi -b output:

  Host: John Kernel: 6.5.0-35-generic x86_64 bits: 64 Desktop: Cinnamon 6.0.4
    Distro: Linux Mint 21.3 Virginia
Machine:
  Type: Desktop System: Gigabyte product: B550 UD AC-Y1 v: -CF
    serial: <superuser required>
  Mobo: Gigabyte model: B550 UD AC-Y1 v: x.x serial: <superuser required>
    UEFI: American Megatrends LLC. v: FD date: 06/08/2023
Battery:
  ID-1: hidpp_battery_0 charge: 98% condition: N/A
CPU:
  Info: 8-core AMD Ryzen 7 5800X3D [MT MCP] speed (MHz): avg: 2189
    min/max: 2200/4549
Graphics:
  Device-1: AMD driver: amdgpu v: kernel
  Display: x11 server: X.Org v: 1.21.1.4 driver: X: loaded: amdgpu,ati
    unloaded: fbdev,modesetting,radeon,vesa gpu: amdgpu resolution:
    1: 2560x1440~60Hz 2: 3840x2160~60Hz 3: 1920x1080~60Hz
  OpenGL: renderer: AMD Radeon RX 7800 XT (navi32 LLVM 17.0.4 DRM 3.54
  6.5.0-35-generic)
    v: 4.6 Mesa 23.3.0-devel
Network:
  Device-1: Realtek RTL8821CE 802.11ac PCIe Wireless Network Adapter
    driver: rtw_8821ce
  Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    driver: r8169
Drives:
  Local Storage: total: 5.46 TiB used: 2.19 TiB (40.1%)
Info:
  Processes: 425 Uptime: 19m Memory: 31.25 GiB used: 4.69 GiB (15.0%)
  Shell: Bash inxi: 3.3.13

Relevant .bashrc lines

export HSA_OVERRIDE_GFX_VERSION=11.0.0
export HCC_AMDGPU_TARGET=gfx1100
export ROCM_PATH=/opt/rocm

Relevant log output

Activate python venv
Launch
12:33:09-691287 INFO     Starting SD.Next                                       
12:33:09-693627 INFO     Logger:                                                
                         file="/home/john/SDNext/automatic/sdnext.log"      
                         level=DEBUG size=64 mode=create                        
12:33:09-694472 INFO     Python 3.10.12 on Linux                                
12:33:09-746485 INFO     Version: app=sd.next updated=2024-05-29 hash=042cac88  
                         branch=master                                          
                         url=https://github.com/vladmandic/automatic/tree/master
12:33:10-214476 INFO     Latest published version:                              
                         d19fc8c7ab9751f11cf1b648818e095ee4a2c2e9               
                         2024-05-29T15:59:32Z                                   
12:33:10-221170 INFO     Platform: arch=x86_64 cpu=x86_64 system=Linux          
                         release=6.5.0-35-generic python=3.10.12                
12:33:10-222388 DEBUG    Setting environment tuning                             
12:33:10-223280 DEBUG    HF cache folder: /home/john/.cache/huggingface/hub 
12:33:10-224126 DEBUG    Torch allocator:                                       
                         "garbage_collection_threshold:0.70,max_split_size_mb:51
                         2"                                                     
12:33:10-225041 DEBUG    Torch overrides: cuda=False rocm=False ipex=False      
                         diml=False openvino=False                              
12:33:10-226147 DEBUG    Torch allowed: cuda=True rocm=True ipex=True diml=True 
                         openvino=True                                          
12:33:10-227419 DEBUG    Package not found: torch-directml                      
12:33:10-228270 INFO     AMD ROCm toolkit detected                              
12:33:10-256597 DEBUG    ROCm agents detected: ['gfx1100']                      
12:33:10-257338 DEBUG    ROCm agent used by default: idx=0 gpu=gfx1100          
                         arch=navi3x                                            
12:33:10-416933 DEBUG    ROCm version detected: 6.1                             
12:33:10-489264 DEBUG    Repository update time: Wed May 29 09:48:41 2024       
12:33:10-490120 INFO     Startup: standard                                      
12:33:10-490770 INFO     Verifying requirements                                 
12:33:10-495376 INFO     Verifying packages                                     
12:33:10-496143 INFO     Verifying submodules                                   
12:33:11-533664 DEBUG    Git detached head detected:                            
                         folder="extensions-builtin/sd-extension-chainner"      
                         reattach=main                                          
12:33:11-534757 DEBUG    Submodule: extensions-builtin/sd-extension-chainner /  
                         main                                                   
12:33:11-567476 DEBUG    Git detached head detected:                            
                         folder="extensions-builtin/sd-extension-system-info"   
                         reattach=main                                          
12:33:11-568356 DEBUG    Submodule: extensions-builtin/sd-extension-system-info 
                         / main                                                 
12:33:11-605585 DEBUG    Git detached head detected:                            
                         folder="extensions-builtin/sd-webui-agent-scheduler"   
                         reattach=main                                          
12:33:11-606456 DEBUG    Submodule: extensions-builtin/sd-webui-agent-scheduler 
                         / main                                                 
12:33:11-655896 DEBUG    Git detached head detected:                            
                         folder="extensions-builtin/sdnext-modernui"            
                         reattach=main                                          
12:33:11-656730 DEBUG    Submodule: extensions-builtin/sdnext-modernui / main   
12:33:11-687927 DEBUG    Git detached head detected:                            
                         folder="extensions-builtin/stable-diffusion-webui-rembg
                         " reattach=master                                      
12:33:11-688798 DEBUG    Submodule:                                             
                         extensions-builtin/stable-diffusion-webui-rembg /      
                         master                                                 
12:33:11-724031 DEBUG    Git detached head detected:                            
                         folder="modules/k-diffusion" reattach=master           
12:33:11-724887 DEBUG    Submodule: modules/k-diffusion / master                
12:33:11-755907 DEBUG    Git detached head detected: folder="wiki"              
                         reattach=master                                        
12:33:11-756755 DEBUG    Submodule: wiki / master                               
12:33:11-801173 DEBUG    Register paths                                         
12:33:11-830556 DEBUG    Installed packages: 244                                
12:33:11-831167 DEBUG    Extensions all: ['sd-webui-agent-scheduler', 'Lora',   
                         'stable-diffusion-webui-images-browser',               
                         'sd-extension-system-info',                            
                         'stable-diffusion-webui-rembg',                        
                         'sd-extension-chainner', 'sdnext-modernui']            
12:33:11-831958 DEBUG    Running extension installer:                           
                         /home/john/SDNext/automatic/extensions-builtin/sd-w
                         ebui-agent-scheduler/install.py                        
12:33:11-987465 DEBUG    Running extension installer:                           
                         /home/john/SDNext/automatic/extensions-builtin/stab
                         le-diffusion-webui-images-browser/install.py           
12:33:12-119228 DEBUG    Running extension installer:                           
                         /home/john/SDNext/automatic/extensions-builtin/sd-e
                         xtension-system-info/install.py                        
12:33:12-253167 DEBUG    Running extension installer:                           
                         /home/john/SDNext/automatic/extensions-builtin/stab
                         le-diffusion-webui-rembg/install.py                    
12:33:12-431875 DEBUG    Extensions all: ['sd-webui-latent-regional-helper',    
                         'openpose-editor']                                     
12:33:12-480191 INFO     Extensions enabled: ['sd-webui-agent-scheduler',       
                         'Lora', 'stable-diffusion-webui-images-browser',       
                         'sd-extension-system-info',                            
                         'stable-diffusion-webui-rembg',                        
                         'sd-extension-chainner', 'sdnext-modernui',            
                         'sd-webui-latent-regional-helper', 'openpose-editor']  
12:33:12-481013 INFO     Verifying requirements                                 
12:33:12-484011 DEBUG    Setup complete without errors: 1717000392              
12:33:12-487895 DEBUG    Extension preload: {'extensions-builtin': 0.0,         
                         'extensions': 0.0}                                     
12:33:12-488770 DEBUG    Starting module: <module 'webui' from                  
                         '/home/john/SDNext/automatic/webui.py'>            
12:33:12-489614 INFO     Command line args: ['--debug'] debug=True              
12:33:12-490267 DEBUG    Env flags: []                                          
12:33:17-530133 INFO     Load packages: {'torch': '2.4.0.dev20240529+rocm6.1',  
                         'diffusers': '0.28.0', 'gradio': '3.43.2'}             
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /home/john/SDNext/automatic/venv/lib/python3.10/site-packages/torch/cuda │
│                                                                              │
│    632     try:                                                              │
│ ❱  633         amdsmi.amdsmi_init()                                          │
│    634     except amdsmi.AmdSmiException as e:                               │
╰──────────────────────────────────────────────────────────────────────────────╯
NameError: name 'amdsmi' is not defined

During handling of the above exception, another exception occurred:

╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /home/john/SDNext/automatic/launch.py:264 in <module>                    │
│                                                                              │
│   263 if __name__ == "__main__":                                             │
│ ❱ 264     main()                                                             │
│   265                                                                        │
│                                                                              │
│ /home/john/SDNext/automatic/launch.py:241 in main                        │
│                                                                              │
│   240                                                                        │
│ ❱ 241     uv, instance = start_server(immediate=True, server=None)           │
│   242     while True:                                                        │
│                                                                              │
│ /home/john/SDNext/automatic/launch.py:168 in start_server                │
│                                                                              │
│   167     get_custom_args()                                                  │
│ ❱ 168     module_spec.loader.exec_module(server)                             │
│   169     uvicorn = None                                                     │
│ in exec_module:883                                                           │
│ in _call_with_frames_removed:241                                             │
│                                                                              │
│ /home/john/SDNext/automatic/webui.py:15 in <module>                      │
│                                                                              │
│    14 from installer import log, git_commit, custom_excepthook               │
│ ❱  15 import ldm.modules.encoders.modules # pylint: disable=unused-import, w │
│    16 from modules import shared, extensions, gr_tempdir, modelloader # pyli │
│                                                                              │
│ /home/john/SDNext/automatic/repositories/ldm/modules/encoders/modules.py │
│                                                                              │
│   135                                                                        │
│ ❱ 136 class ClipImageEmbedder(nn.Module):                                    │
│   137     def __init__(                                                      │
│                                                                              │
│ /home/john/SDNext/automatic/repositories/ldm/modules/encoders/modules.py │
│                                                                              │
│   140             jit=False,                                                 │
│ ❱ 141             device='cuda' if torch.cuda.is_available() else 'cpu',     │
│   142             antialias=True,                                            │
│                                                                              │
│ /home/john/SDNext/automatic/venv/lib/python3.10/site-packages/torch/cuda │
│                                                                              │
│    121         # fails, this assessment falls back to the default CUDA Runti │
│ ❱  122         return device_count() > 0                                     │
│    123     else:                                                             │
│                                                                              │
│ /home/john/SDNext/automatic/venv/lib/python3.10/site-packages/torch/cuda │
│                                                                              │
│    833     # bypass _device_count_nvml() if rocm (not supported)             │
│ ❱  834     nvml_count = _device_count_amdsmi() if torch.version.hip else _de │
│    835     r = torch._C._cuda_getDeviceCount() if nvml_count < 0 else nvml_c │
│                                                                              │
│ /home/john/SDNext/automatic/venv/lib/python3.10/site-packages/torch/cuda │
│                                                                              │
│    755         else:                                                         │
│ ❱  756             raw_cnt = _raw_device_count_amdsmi()                      │
│    757             if raw_cnt <= 0:                                          │
│                                                                              │
│ /home/john/SDNext/automatic/venv/lib/python3.10/site-packages/torch/cuda │
│                                                                              │
│    633         amdsmi.amdsmi_init()                                          │
│ ❱  634     except amdsmi.AmdSmiException as e:                               │
│    635         warnings.warn(f"Can't initialize amdsmi - Error code: {e.err_ │
╰──────────────────────────────────────────────────────────────────────────────╯
NameError: name 'amdsmi' is not defined

Backend

Diffusers

Branch

Master

Model

Other

Disty0 commented 4 months ago

Seems like that error is in PyTorch itself. Can you try downgrading to a stable release? You are using a nightly build because a stable build for ROCm 6.1 doesn't exist yet.

Activate the venv:

source venv/bin/activate
pip install torch==2.3.0+rocm6.0 torchvision==0.18.0+rocm6.0 --index-url https://download.pytorch.org/whl/rocm6.0

WrongFerret commented 4 months ago

That was the issue. Thank you very much!
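
To confirm which kind of build ended up in the venv after a downgrade like this, the version string can be split into its parts. This is only a sketch: describe_torch_build is a hypothetical helper, and the two sample strings are the ones that appear in this thread (the nightly from the log and the stable release from the pip command).

```python
def describe_torch_build(version):
    """Split a torch version string into release, nightly flag, and backend tag."""
    # Everything after '+' is the local build tag, e.g. 'rocm6.1'.
    release, _, local = str(version).partition("+")
    return {
        "release": release,
        # Nightly wheels carry a '.devYYYYMMDD' segment in the release part.
        "nightly": ".dev" in release,
        "backend": local or None,
    }


if __name__ == "__main__":
    for sample in ("2.4.0.dev20240529+rocm6.1", "2.3.0+rocm6.0"):
        print(sample, "->", describe_torch_build(sample))
```

Inside the activated venv, passing torch.__version__ to the helper would report on the installed build instead of the sample strings.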

Delaunay commented 1 day ago

NOTE: Issue is still there for 2.4.1. Might be fixed in nightly.

Fix is to uninstall pynvml:

pip uninstall pynvml nvidia-ml-py -y