invoke-ai / InvokeAI

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, and serves as the foundation for multiple commercial products.
https://invoke-ai.github.io/InvokeAI/
Apache License 2.0

[bug]: #6976

Open · GoDJr opened 1 week ago

GoDJr commented 1 week ago

Is there an existing issue for this problem?

Operating system

Linux

GPU vendor

AMD (ROCm)

GPU model

RX 6900 XT

GPU VRAM

16 GB

Version number

5.0

Browser

Firefox

Python dependencies

Python 3.10

What happened

When I try to Invoke, I get the following error:

RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.

ComfyUI works fine on the current ROCm setup.
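A quick way to tell whether the failure is specific to Invoke or comes from the PyTorch/ROCm build itself is to run a small tensor op on the GPU from the same environment. This is a minimal sketch, assuming the same conda env Invoke runs in; if it raises the same "invalid device function" error, the installed PyTorch wheel most likely lacks kernels for this GPU.

```bash
# Minimal sanity check, run from the environment Invoke uses. If this also
# raises "HIP error: invalid device function", the problem lies in the
# PyTorch/ROCm wheel rather than in InvokeAI itself.
python -c "import torch; \
print(torch.__version__, torch.version.hip, torch.cuda.get_device_name(0)); \
print((torch.ones(8, device='cuda') * 2).sum())"
```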

What you expected to happen

generate an image

How to reproduce the problem

No response

Additional context

On the initial install it kept installing the CPU build of PyTorch, so I uninstalled that and installed the ROCm-compatible PyTorch. On launch it does pick up my AMD Radeon, but it throws an error saying the bitsandbytes setup failed despite there being a CUDA-compatible card. When I run python -m bitsandbytes I get AttributeError: 'NoneType' object has no attribute 'split'. It also seems like I'm getting a patchmatch compile error. Here is the log:

Could not load bitsandbytes native library: 'NoneType' object has no attribute 'split'
Traceback (most recent call last):
  File "/home/sagar/miniconda3/envs/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 109, in <module>
    lib = get_native_library()
  File "/home/sagar/miniconda3/envs/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 88, in get_native_library
    cuda_specs = get_cuda_specs()
  File "/home/sagar/miniconda3/envs/invoke/lib/python3.10/site-packages/bitsandbytes/cuda_specs.py", line 39, in get_cuda_specs
    cuda_version_string=(get_cuda_version_string()),
  File "/home/sagar/miniconda3/envs/invoke/lib/python3.10/site-packages/bitsandbytes/cuda_specs.py", line 29, in get_cuda_version_string
    major, minor = get_cuda_version_tuple()
  File "/home/sagar/miniconda3/envs/invoke/lib/python3.10/site-packages/bitsandbytes/cuda_specs.py", line 24, in get_cuda_version_tuple
    major, minor = map(int, torch.version.cuda.split("."))
AttributeError: 'NoneType' object has no attribute 'split'

CUDA Setup failed despite CUDA being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
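The AttributeError above happens because ROCm builds of PyTorch leave torch.version.cuda set to None (the HIP version lives in torch.version.hip instead), and upstream bitsandbytes tries to split that string. A quick check of which build is installed:

```bash
# On a ROCm wheel torch.version.cuda is None and torch.version.hip is set;
# bitsandbytes parses torch.version.cuda, which is what triggers the
# 'NoneType' object has no attribute 'split' error shown above.
python -c "import torch; print('cuda:', torch.version.cuda); print('hip:', torch.version.hip)"
```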

patchmatch.patch_match: INFO - Compiling and loading c extensions from "/home/sagar/miniconda3/envs/invoke/lib/python3.10/site-packages/patchmatch".
patchmatch.patch_match: ERROR - patchmatch failed to load or compile (Command 'make clean && make' returned non-zero exit status 2.).
patchmatch.patch_match: INFO - Refer to https://invoke-ai.github.io/InvokeAI/installation/060_INSTALL_PATCHMATCH/ for installation instructions.
[2024-09-28 00:10:44,825]::[InvokeAI]::INFO --> Patchmatch not loaded (nonfatal)
[2024-09-28 00:10:45,939]::[InvokeAI]::INFO --> Using torch device: AMD Radeon Graphics
[2024-09-28 00:10:46,169]::[InvokeAI]::INFO --> cuDNN version: 3001000
[2024-09-28 00:10:46,183]::[uvicorn.error]::INFO --> Started server process [38431]
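For the patchmatch part of the log, the 'make clean && make' step just needs a C++ toolchain and the OpenCV development headers; the extension recompiles itself on the next launch once they are present. The package names below are an assumption for a Debian/Ubuntu-style system, so check the install docs linked in the log for other distros.

```bash
# Assumption: Debian/Ubuntu package names; see the patchmatch install docs
# linked in the log above for other distributions.
sudo apt install build-essential python3-opencv libopencv-dev
```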

Discord username

lordoflaziness

GoDJr commented 1 week ago

I was able to solve the bitsandbytes issue as well as the patchmatch issue. I also set the device ID to my card, so on boot it says torch device: AMD 6900 XT. But every time I try to Invoke I still get the same error message:

RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
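As a next debugging step, the error message itself suggests serializing kernel launches so the stack trace points at the real failing op. A sketch, assuming the invokeai-web entry point (substitute whatever command normally launches Invoke):

```bash
# AMD_SERIALIZE_KERNEL=3 comes straight from the error text above; it makes
# HIP report kernel failures synchronously so the traceback is trustworthy.
# `invokeai-web` is assumed here; use your usual launch command.
AMD_SERIALIZE_KERNEL=3 invokeai-web
```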