Running on Ubuntu VirtualBox

drago87 commented 8 months ago

I'm trying to run ComfyUI on Ubuntu in VirtualBox, but when I run python3 main.py from within the ComfyUI folder I get:

Traceback (most recent call last):
  File "/home/drago87/Desktop/AI/ComfyUI/main.py", line 72, in <module>
    import execution
  File "/home/drago87/Desktop/AI/ComfyUI/execution.py", line 12, in <module>
    import nodes
  File "/home/drago87/Desktop/AI/ComfyUI/nodes.py", line 20, in <module>
    import comfy.diffusers_load
  File "/home/drago87/Desktop/AI/ComfyUI/comfy/diffusers_load.py", line 4, in <module>
    import comfy.sd
  File "/home/drago87/Desktop/AI/ComfyUI/comfy/sd.py", line 5, in <module>
    from comfy import model_management
  File "/home/drago87/Desktop/AI/ComfyUI/comfy/model_management.py", line 114, in <module>
    total_vram = get_total_memory(get_torch_device()) / (1024 * 1024)
  File "/home/drago87/Desktop/AI/ComfyUI/comfy/model_management.py", line 83, in get_torch_device
    return torch.device(torch.cuda.current_device())
  File "/home/drago87/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 785, in current_device
    _lazy_init()
  File "/home/drago87/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 298, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No HIP GPUs are available

I have an AMD RX 5700 XT and have installed ROCm 5.7.

drago87 commented 8 months ago

Probably a duplicate of issu-1707644920

San4itos commented 8 months ago

You have an AMD RX 5700 XT, but your VirtualBox guest system only has a VMSVGA or VBoxVGA virtual graphics card. Maybe that is the cause.

drago87 commented 8 months ago

In the VM under Display I have access to VBoxVGA, VMSVGA, VBoxSVGA and None. With VBoxVGA and VBoxSVGA I get a picture but can't interact with anything. VMSVGA is what I have been using and "works". None gives a black screen.

drago87 commented 8 months ago

I have caved and installed Ubuntu, but now I get this error:

Traceback (most recent call last):
  File "/home/drago87/AI/ComfyUI/main.py", line 72, in <module>
    import execution
  File "/home/drago87/AI/ComfyUI/execution.py", line 12, in <module>
    import nodes
  File "/home/drago87/AI/ComfyUI/nodes.py", line 20, in <module>
    import comfy.diffusers_load
  File "/home/drago87/AI/ComfyUI/comfy/diffusers_load.py", line 4, in <module>
    import comfy.sd
  File "/home/drago87/AI/ComfyUI/comfy/sd.py", line 5, in <module>
    from comfy import model_management
  File "/home/drago87/AI/ComfyUI/comfy/model_management.py", line 114, in <module>
    total_vram = get_total_memory(get_torch_device()) / (1024 * 1024)
  File "/home/drago87/AI/ComfyUI/comfy/model_management.py", line 83, in get_torch_device
    return torch.device(torch.cuda.current_device())
  File "/home/drago87/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 785, in current_device
    _lazy_init()
  File "/home/drago87/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 298, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No HIP GPUs are available

After I get this fixed I only need to make a file on my desktop to run it.

Can someone point me in the right direction?

comfyanonymous commented 8 months ago

ROCm only works on native Linux. It won't work in VirtualBox or any other VM unless you do PCI passthrough.
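
To confirm which of those situations a machine is in, here is a minimal sketch, assuming a systemd-based distro such as Ubuntu. systemd-detect-virt prints none on bare metal and, as far as I know, oracle inside a VirtualBox guest. /dev/kfd is the compute device node that the ROCm user-space stack opens. The function names are made up for illustration.

```python
import os
import shutil
import subprocess


def virtualization():
    """Return systemd-detect-virt's answer ('none' on bare metal), or None if unavailable."""
    if shutil.which("systemd-detect-virt") is None:
        return None
    result = subprocess.run(
        ["systemd-detect-virt"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


def has_rocm_device_nodes():
    """True if the kernel exposes /dev/kfd and /dev/dri, which ROCm needs."""
    return os.path.exists("/dev/kfd") and os.path.isdir("/dev/dri")


print("virtualization:", virtualization())
print("ROCm device nodes present:", has_rocm_device_nodes())
```

If the first line reports anything other than none, or the second reports False, the "No HIP GPUs are available" error is expected regardless of which PyTorch build is installed.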

drago87 commented 8 months ago

That was what I was afraid of. Now I have installed it on a proper Ubuntu install and installed everything, but I get this error:

Traceback (most recent call last):
  File "/home/drago87/AI/ComfyUI/main.py", line 72, in <module>
    import execution
  File "/home/drago87/AI/ComfyUI/execution.py", line 12, in <module>
    import nodes
  File "/home/drago87/AI/ComfyUI/nodes.py", line 20, in <module>
    import comfy.diffusers_load
  File "/home/drago87/AI/ComfyUI/comfy/diffusers_load.py", line 4, in <module>
    import comfy.sd
  File "/home/drago87/AI/ComfyUI/comfy/sd.py", line 5, in <module>
    from comfy import model_management
  File "/home/drago87/AI/ComfyUI/comfy/model_management.py", line 33, in <module>
    import torch_directml
  File "/home/drago87/.local/lib/python3.10/site-packages/torch_directml/__init__.py", line 13, in <module>
    torch.ops.load_library(directml_dll)
  File "/home/drago87/.local/lib/python3.10/site-packages/torch/_ops.py", line 852, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libd3d12.so: cannot open shared object file: No such file or directory

Th3Rom3 commented 8 months ago

Did you follow the instructions to set up ROCm? https://rocm.docs.amd.com/en/latest/deploy/linux/install_overview.html

As a first step you could try and see what the rocminfo command reports back in order to check if ROCm has been set up correctly.

You can then also try to add the user executing the Python script to the video and render user groups:

sudo usermod -a -G video $LOGNAME
sudo usermod -a -G render $LOGNAME
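
To see whether the group change is actually needed, the group database can be queried from Python. This is only a sketch using the standard grp and pwd modules; missing_gpu_groups is a hypothetical helper name, and it reads the system database rather than the groups of the current login session.

```python
import grp
import pwd


def missing_gpu_groups(username, required=("video", "render")):
    """Return the groups in `required` that `username` does not belong to."""
    try:
        primary_gid = pwd.getpwnam(username).pw_gid
    except KeyError:
        primary_gid = None  # user not found in the passwd database
    missing = []
    for name in required:
        try:
            group = grp.getgrnam(name)
        except KeyError:
            missing.append(name)  # group does not exist on this system
            continue
        # A user belongs to a group either as a listed member or via primary gid.
        if username not in group.gr_mem and group.gr_gid != primary_gid:
            missing.append(name)
    return missing
```

For example, missing_gpu_groups("drago87") would return an empty list once both usermod commands have been applied. New group memberships normally only take effect after logging out and back in.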

This is what seems to have worked for me in the end.

drago87 commented 8 months ago

I have followed that now and there is no difference. A search on the computer finds nothing for libd3d12. A search for d3d12 gives:

d3d12_dri.so
d3d12_drv_video.so
libvdpau_d3d12.so
libvdpau_d3d12.so.1
libvdpau_d3d12.so.1.0
libvdpau_d3d12.so.1.0.0

These are in usr/lib/x86_64-linux-gnu and its subfolders, plus 3 copies of d3d12_dri.so in some snap/gnome* folders and 1 d3d12_drv_video.so in snap/gnome-42-2204/141/usr/lib/x86_64-linux-gnu/dri

Th3Rom3 commented 8 months ago

Just to double check, you are not accidentally trying to run the DirectML version? The traceback fails while importing torch_directml, which points that way. The 5700 XT should run via ROCm under native Linux. It will probably need the override, though:

HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py
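
The same override can be applied from a small Python launcher instead of typing it each time. This is a sketch under assumptions: rocm_env and launch_comfyui are hypothetical helper names, and the value 10.3.0 is simply the one from the command above, which as far as I understand makes the ROCm runtime treat the card as a gfx1030 part.

```python
import os
import subprocess
import sys


def rocm_env(gfx_version="10.3.0", base_env=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set."""
    env = dict(os.environ if base_env is None else base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    return env


def launch_comfyui(comfyui_dir, extra_args=()):
    """Run main.py inside `comfyui_dir` with the override applied."""
    cmd = [sys.executable, "main.py", *extra_args]
    return subprocess.run(cmd, cwd=comfyui_dir, env=rocm_env(), check=False)
```

Calling launch_comfyui("/home/drago87/AI/ComfyUI") would be equivalent to running the shell one-liner from that folder, using whichever Python interpreter runs the launcher.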

drago87 commented 8 months ago

I was running with --directml

HSA_OVERRIDE_GFX_VERSION=10.3.0 python3 main.py
gives
Traceback (most recent call last):
  File "/home/drago87/AI/ComfyUI/main.py", line 72, in <module>
    import execution
  File "/home/drago87/AI/ComfyUI/execution.py", line 12, in <module>
    import nodes
  File "/home/drago87/AI/ComfyUI/nodes.py", line 20, in <module>
    import comfy.diffusers_load
  File "/home/drago87/AI/ComfyUI/comfy/diffusers_load.py", line 4, in <module>
    import comfy.sd
  File "/home/drago87/AI/ComfyUI/comfy/sd.py", line 5, in <module>
    from comfy import model_management
  File "/home/drago87/AI/ComfyUI/comfy/model_management.py", line 114, in <module>
    total_vram = get_total_memory(get_torch_device()) / (1024 * 1024)
  File "/home/drago87/AI/ComfyUI/comfy/model_management.py", line 83, in get_torch_device
    return torch.device(torch.cuda.current_device())
  File "/home/drago87/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 769, in current_device
    _lazy_init()
  File "/home/drago87/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 298, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

Running lspci | grep VGA gives VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] (rev c1)

Th3Rom3 commented 8 months ago

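This might be the source of the error: the "Found no NVIDIA driver" message suggests that a CUDA build of PyTorch is the one being imported. Try installing the ROCm build of PyTorch (below is the ROCm 5.7 nightly from the ComfyUI installation guide):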

pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.7

Then run ComfyUI with the override:

HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py
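
Before launching, it may help to check which backend the importable torch was built for. A minimal sketch: torch.version.hip is a version string on ROCm builds and None otherwise, and torch.version.cuda is the CUDA counterpart. The function name torch_backend is just for illustration.

```python
def torch_backend():
    """Describe which GPU backend the importable torch build targets."""
    try:
        import torch
    except (ImportError, OSError):
        return "torch not importable"
    # getattr keeps this safe on builds where an attribute is absent.
    hip = getattr(torch.version, "hip", None)
    cuda = getattr(torch.version, "cuda", None)
    if hip:
        return f"rocm {hip}"
    if cuda:
        return f"cuda {cuda}"
    return "cpu-only or other"


print(torch_backend())
```

If this prints a cuda line on an AMD-only machine, the CUDA wheel is the one being picked up and the ROCm wheel still needs to replace it.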

drago87 commented 8 months ago

Hmm, I have already installed that one, but if I try doing it again I get this:

Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://download.pytorch.org/whl/nightly/rocm5.7
Requirement already satisfied: torch in /home/drago87/.local/lib/python3.10/site-packages (2.1.0)
Requirement already satisfied: torchvision in /home/drago87/.local/lib/python3.10/site-packages (0.15.1)
Requirement already satisfied: torchaudio in /home/drago87/.local/lib/python3.10/site-packages (2.2.0.dev20231101+rocm5.7)
Requirement already satisfied: sympy in /home/drago87/.local/lib/python3.10/site-packages (from torch) (1.11.1)
Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (11.4.5.107)
Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (12.1.3.1)
Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (11.0.2.54)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (12.1.105)
Requirement already satisfied: fsspec in /home/drago87/.local/lib/python3.10/site-packages (from torch) (2023.10.0)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (12.1.105)
Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (10.3.2.106)
Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (12.1.0.106)
Requirement already satisfied: filelock in /home/drago87/.local/lib/python3.10/site-packages (from torch) (3.9.0)
Requirement already satisfied: triton==2.1.0 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (2.1.0)
Requirement already satisfied: networkx in /home/drago87/.local/lib/python3.10/site-packages (from torch) (3.0rc1)
Requirement already satisfied: nvidia-nccl-cu12==2.18.1 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (2.18.1)
Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (8.9.2.26)
Requirement already satisfied: typing-extensions in /home/drago87/.local/lib/python3.10/site-packages (from torch) (4.4.0)
Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (12.1.105)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (12.1.105)
Requirement already satisfied: jinja2 in /home/drago87/.local/lib/python3.10/site-packages (from torch) (3.1.2)
Requirement already satisfied: nvidia-nvjitlink-cu12 in /home/drago87/.local/lib/python3.10/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch) (12.3.52)
Requirement already satisfied: numpy in /home/drago87/.local/lib/python3.10/site-packages (from torchvision) (1.24.1)
Requirement already satisfied: requests in /usr/lib/python3/dist-packages (from torchvision) (2.25.1)
Collecting torchvision
  Using cached https://download.pytorch.org/whl/nightly/rocm5.7/torchvision-0.17.0.dev20231102%2Brocm5.7-cp310-cp310-linux_x86_64.whl (65.6 MB)
Collecting torch
  Using cached https://download.pytorch.org/whl/nightly/rocm5.7/torch-2.2.0.dev20231102%2Brocm5.7-cp310-cp310-linux_x86_64.whl (1701.2 MB)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /usr/lib/python3/dist-packages (from torchvision) (9.0.1)
Requirement already satisfied: pytorch-triton-rocm==2.1.0+34f8189eae in /home/drago87/.local/lib/python3.10/site-packages (from torch) (2.1.0+34f8189eae)
Collecting torchaudio
  Using cached https://download.pytorch.org/whl/nightly/rocm5.7/torchaudio-2.2.0.dev20231102%2Brocm5.7-cp310-cp310-linux_x86_64.whl (1.7 MB)
Requirement already satisfied: MarkupSafe>=2.0 in /home/drago87/.local/lib/python3.10/site-packages (from jinja2->torch) (2.1.2)
Requirement already satisfied: mpmath>=0.19 in /home/drago87/.local/lib/python3.10/site-packages (from sympy->torch) (1.2.1)
ERROR: Exception:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pip/_internal/cli/base_command.py", line 165, in exc_logging_wrapper
    status = run_func(*args)
  File "/usr/lib/python3/dist-packages/pip/_internal/cli/req_command.py", line 205, in wrapper
    return func(self, options, args)
  File "/usr/lib/python3/dist-packages/pip/_internal/commands/install.py", line 389, in run
    to_install = resolver.get_installation_order(requirement_set)
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/resolver.py", line 188, in get_installation_order
    weights = get_topological_weights(
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/resolver.py", line 276, in get_topological_weights
    assert len(weights) == expected_node_count
AssertionError

Th3Rom3 commented 8 months ago

Sadly I am no expert in the Linux environment. It looks like something might be wrong with the local Python installation. Someone more adept might be able to help. Maybe a force-reinstall could help.

Just keep in mind that the torch packages for CUDA, ROCm, and DirectML are different builds. Installed into the same Python environment, they will replace some of each other's components.
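
The pip log above hints at exactly that kind of mix: torch is listed as 2.1.0 with no build tag, while torchaudio is 2.2.0.dev20231101+rocm5.7. A rough way to spot this without importing anything is to compare the local version labels (the part after the +) in the installed package metadata. This is only a sketch, and build_tag and torch_family_tags are hypothetical names.

```python
from importlib import metadata


def build_tag(version):
    """Return the local version label, e.g. 'rocm5.7', or None if there is none."""
    _, sep, local = version.partition("+")
    return local if sep else None


def torch_family_tags(packages=("torch", "torchvision", "torchaudio")):
    """Map each installed package to (version, build tag); skip missing ones."""
    tags = {}
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        tags[name] = (version, build_tag(version))
    return tags


for name, (version, tag) in torch_family_tags().items():
    print(f"{name}: {version} (build tag: {tag})")
```

If the three packages do not share the same tag, they most likely came from different indexes and should be reinstalled together from one.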

BTW, checking rocminfo might still be worthwhile to check if the GPU side of things is working correctly.

drago87 commented 8 months ago

ROCk module is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.1
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             
Mwaitx:                  DISABLED
DMAbuf Support:          YES

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    AMD Ryzen 5 3600X 6-Core Processor 
  Uuid:                    CPU-XX                             
  Marketing Name:          AMD Ryzen 5 3600X 6-Core Processor 
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3800                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            12                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    32766012(0x1f3f83c) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    32766012(0x1f3f83c) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    32766012(0x1f3f83c) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 2                  
*******                  
  Name:                    gfx1010                            
  Uuid:                    GPU-XX                             
  Marketing Name:          AMD Radeon RX 5700 XT              
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                           
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
    L2:                      4096(0x1000) KB                    
  Chip ID:                 29471(0x731f)                      
  ASIC Revision:           2(0x2)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   2100                               
  BDFID:                   3328                               
  Internal Node ID:        1                                  
  Compute Unit:            40                                 
  SIMDs per CU:            2                                  
  Shader Engines:          2                                  
  Shader Arrs. per Eng.:   2                                  
  WatchPts on Addr. Ranges:4                                  
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      TRUE                               
  Wavefront Size:          32(0x20)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        40(0x28)                           
  Max Work-item Per CU:    1280(0x500)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Packet Processor uCode:: 146                                
  SDMA engine uCode::      35                                 
  IOMMU Support::          None                               
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    8372224(0x7fc000) KB               
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GLOBAL; FLAGS:                     
      Size:                    8372224(0x7fc000) KB               
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1010:xnack-  
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*** Done ***

Th3Rom3 commented 8 months ago

This part looks good. HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py should work now after getting the python package errors sorted.
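
For anyone repeating this check, the relevant detail in the rocminfo output is a GPU agent whose Name is a gfx target (gfx1010 here). A small sketch that pulls those names out of the text follows. The regular expression is an assumption based only on the output format shown in this thread, and the function names are made up.

```python
import re
import shutil
import subprocess

# Matches agent lines such as "  Name:    gfx1010", but not "Marketing Name:"
# or the "amdgcn-amd-amdhsa--gfx1010" ISA name.
GFX_NAME = re.compile(r"^[ \t]*Name:[ \t]+(gfx\w+)[ \t]*$", re.MULTILINE)


def gpu_targets(rocminfo_text):
    """Return the gfx target names of the GPU agents in rocminfo output."""
    return GFX_NAME.findall(rocminfo_text)


def run_rocminfo():
    """Run rocminfo if it is on PATH and return its stdout, else None."""
    if shutil.which("rocminfo") is None:
        return None
    result = subprocess.run(
        ["rocminfo"], capture_output=True, text=True, check=False
    )
    return result.stdout
```

An empty list from gpu_targets(run_rocminfo() or "") would mean ROCm does not see any GPU, which has to be fixed before looking at the Python side.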

drago87 commented 8 months ago

I managed to get it to work. I was missing a lot of packages, but I just followed the errors and installed them one by one.

Do you know of a guide I can follow to make an executable to run it from the desktop?

Th3Rom3 commented 8 months ago

First thing that comes to my mind is creating a .desktop entry. But there are probably more elegant ways.
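
A rough sketch of what such a .desktop entry could look like, generated from Python so the path is easy to change. The keys used (Type, Name, Path, Exec, Terminal) are standard Desktop Entry keys. The ComfyUI path is taken from the tracebacks above, and the helper names and the launcher file name are hypothetical.

```python
from pathlib import Path


def desktop_entry(comfyui_dir, gfx_version="10.3.0"):
    """Build the text of a .desktop launcher that starts ComfyUI in a terminal."""
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        "Name=ComfyUI",
        # Path sets the working directory, so main.py is found relative to it.
        f"Path={comfyui_dir}",
        f"Exec=env HSA_OVERRIDE_GFX_VERSION={gfx_version} python3 main.py",
        "Terminal=true",
    ]
    return "\n".join(lines) + "\n"


def write_launcher(target, comfyui_dir):
    """Write the launcher to `target` and mark it executable."""
    target = Path(target)
    target.write_text(desktop_entry(comfyui_dir), encoding="utf-8")
    target.chmod(0o755)
    return target
```

For example, write_launcher(Path.home() / "Desktop" / "ComfyUI.desktop", "/home/drago87/AI/ComfyUI") would create the file. Depending on the desktop environment, the icon may also need to be marked as trusted (on Ubuntu, right-click and "Allow Launching") before it will run.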

But bottom line to summarize the issue: You are now able to generate images using your 5700XT via ROCm 5.7 under native Ubuntu, correct?

drago87 commented 8 months ago

Well, I'm able to start ComfyUI, but I'm getting this error with the bare minimum nodes:

got prompt
ERROR:root:Failed to validate prompt for output 18:
ERROR:root:* KSampler 15:
ERROR:root:  - Required input is missing: latent_image
ERROR:root:Output will be ignored
invalid prompt: {'type': 'prompt_outputs_failed_validation', 'message': 'Prompt outputs failed validation', 'details': '', 'extra_info': {}}
got prompt
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
missing {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
left over keys: dict_keys(['conditioner.embedders.1.model.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.transformer.text_model.embeddings.position_ids'])
Requested to load SDXLClipModel
Loading 1 new model
Memory access fault by GPU node-1 (Agent handle: 0x72e3f20) on address 0x7f3cc4ffd000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)

comfyui.log

The first error I got was only because I forgot the empty latent.

If it would help, I can let you remote control my PC with AnyDesk. I'm usually up from 10-11 in the morning to 3 at night CET (GMT+1). I'm not home this Saturday, but otherwise I should be home.

Th3Rom3 commented 8 months ago

I am sorry but I won't give any support on that scale.

As for some ad hoc advice after a quick glance over the error msg: Start with a fresh ComfyUI without custom nodes. And I suggest trying a SD 1.5 based model first, before going for SDXL.

Also in this git issue there are some possible mitigation steps based on Automatic1111

drago87 commented 8 months ago

Ok will try that

comfyanonymous / ComfyUI

Running on Ubuntu VirtualBox #1870