Closed genelatham closed 4 months ago
Could you please try running:
pip install torch torchvision --force-reinstall --extra-index-url https://download.pytorch.org/whl/rocm5.6
and see if it fixes the issue? (Please make sure you have the virtual environment activated when you do this: source /home/invoke/invokeai/.venv/bin/activate.) Then try running Invoke again.
If this still does not help, please delete the virtual environment (the /home/invoke/invokeai/.venv folder), install again, and attach the complete console output of the install process.
Same here. Tried the latest release (3.6.2) and the manual install, using both PyTorch ROCm 5.6 and 5.4.2, on Linux with an RX 6650 XT.
@sysbadmin to clarify - are you experiencing the same issue after trying the steps above?
Yes
I did the install as requested. To establish the venv I used the developer's console; if that's not right, I will try again. So, as requested, I removed the .venv directory and reran the install. I had done a manual install before, so this time I did an automatic one hoping for a better result. It ran for a very long time (my internet connection is limited). The requested log is attached: install.log
Thanks for the logs @genelatham. It's surprising to me that you're getting nvidia and onnx libraries installed. On the plus side, you have a lot of the requirements cached already, so the installation shouldn't use much of your bandwidth.
Let's run through a very basic manual install and see if you get better results. Do not use the developer console, and do not activate any virtual environment prior to this:
# delete the virtual environment
rm -rf /home/invoke/invokeai/.venv/
# create a new virtual environment
python3 -m venv /home/invoke/invokeai/.venv
# activate it
source /home/invoke/invokeai/.venv/bin/activate
# install invoke
pip install "invokeai==3.6.2" --extra-index-url https://download.pytorch.org/whl/rocm5.4.2
# configure invoke (this *may* download new models, but you probably have some locally already)
invokeai-configure
# run invoke
invokeai-web
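Once the install finishes, a quick sanity check can confirm whether the ROCm build of torch actually landed in the venv and can see the card. This is just a diagnostic sketch, not part of the InvokeAI install; note that ROCm builds of torch report the GPU through the torch.cuda API:

```python
import importlib.util

def rocm_torch_status():
    """Report which torch build is installed and whether a GPU is visible."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch
    # torch.version.hip is set only on ROCm builds (None on CUDA/CPU wheels)
    hip = getattr(torch.version, "hip", None)
    if hip is None:
        return f"torch {torch.__version__} is a CUDA/CPU build, not ROCm"
    # ROCm builds expose the GPU through the torch.cuda API
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} (ROCm {hip}) installed, but no GPU visible"
    return f"ROCm {hip} build, using {torch.cuda.get_device_name(0)}"

print(rocm_torch_status())
```

Run this inside the activated venv; "no GPU visible" with a ROCm build installed usually points at a driver or permissions problem rather than the wheels.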
Please report back with the results. We appreciate you helping troubleshoot this!
"it's surprising to me that you're getting nvidia and onnx libraries installed" it kinda surprised me too. I don't know anything about how Cuda or ROCM work so I figured I just didn't understand.
When I ran the pip install it downloaded a few modules, but only a few, maybe nightly updates. The invokeai-configure step did not offer me an option for AMD or ROCm, so I left it on auto. I made no adjustments to the configuration. I captured the log with the script command, and the results are a little hard for me to read; if you have alternative suggestions, I can do it again. I think it's mostly the "operator entertainment" that is so ugly. (Actually it was not that bad using VS Code rather than cat.) Log is attached: manual.log
Just wanted to leave a note that I had this EXACT same issue, but fixed it by launching the dev console and reinstalling ROCm:
pip install --force-reinstall torch==2.1.0 --index-url https://download.pytorch.org/whl/rocm5.6
Thank you @harm0nic, --force-reinstall is key here. @genelatham please give this a try.
I have done this several times. But nonetheless I copied the command directly from the above mail message. Whenever I run this command I get the following message at the end:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.16.2 requires torch==2.1.2, but you have torch 2.1.0+rocm5.6 which is incompatible.
invokeai 3.6.2 requires torch==2.1.2, but you have torch 2.1.0+rocm5.6 which is incompatible.
huggingface-hub 0.20.2 requires fsspec>=2023.5.0, but you have fsspec 2023.4.0 which is incompatible.
Successfully installed MarkupSafe-2.1.3 filelock-3.9.0 fsspec-2023.4.0 jinja2-3.1.2 mpmath-1.3.0 networkx-3.2.1 pytorch-triton-rocm-2.1.0 sympy-1.12 torch-2.1.0+rocm5.6 typing-extensions-4.8.0
I don't know if this is an issue or not.
In any case, it still uses the CPU for generation.
For full disclosure, I'm running InvokeAI on an openSUSE Leap 15 installation, so my solution may not work for you. But what helped me was doing the following:
pip install --force-reinstall torch==2.1.2+rocm5.6 --index-url https://download.pytorch.org/whl/rocm5.6
pip install --force-reinstall torchaudio==2.1.2+rocm5.6 --index-url https://download.pytorch.org/whl/rocm5.6
The only complaint I get after that is that, when the installation finishes, it warns about a Hugging Face library not being the right version. But GPU generation works.
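Since the conflicts pip reports in this thread are all about the torch family getting out of sync, a small helper like this sketch can list what is actually installed in the venv, for comparison against the pairs pip expects (nothing here is InvokeAI-specific):

```python
import importlib

def torch_family_versions():
    """Return installed versions of the torch-family packages (None if absent)."""
    versions = {}
    for name in ("torch", "torchvision", "torchaudio"):
        try:
            versions[name] = importlib.import_module(name).__version__
        except ImportError:
            versions[name] = None
    return versions

# Prints something like {'torch': '2.1.2+rocm5.6', ...} in the venv
print(torch_family_versions())
```

A "+rocm" suffix on torch but not on torchvision is the kind of mismatch that produced the resolver errors quoted earlier.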
@Shal-Ziar thanks for the suggestion. After running the suggested commands I got the following error message, and it still uses the CPU.
/home/invoke/invokeai/.venv/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory' If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
It appears to be looking for the cuda stuff. But I don't know if that is important.
I am often guilty of thinking things don't matter that do. So I thought I would add some background information.
First of all, I use the setup I will describe with Easy Diffusion every day. ED installed and found the GPU on the first try. So I thought the setup was good for Stable Diffusion and that the issue might be elsewhere.
So, the system runs as a guest on a Proxmox server. The GPU is a Radeon RX 6700 XT, passed to the VM via PCI passthrough. The CPU is a Ryzen 5 1600 (six cores, 12 threads), passed through to the VM as native. The VM has 28 GB of memory and is running Linux Mint 21.3 Virginia (which is based on Ubuntu 22.04). I installed ROCm version 6.0.0.
The VM sees the RX 6700 as a secondary graphics adapter.
So, while I don't think any of this matters since it works with Easy Diffusion; I wanted to put a system description in the bug report.
A note: when I am in the developer's console and I run pip list | grep roc I get:
multiprocess          0.70.15
pytorch-triton-rocm   2.1.0
torch                 2.1.2+rocm5.6
torchaudio            2.1.2+rocm5.6
If any more info is needed, let me know.
@genelatham Have you by chance tried the fixes mentioned over in #4211 yet? This worked for me: specifically, I had to do the reinstall fix mentioned here (and in the linked issue; they're the same thing, it seems), make sure that opencv and python3-opencv were installed, and finally make sure that export HSA_OVERRIDE_GFX_VERSION=10.3.0 was run before running invoke.sh. I'm currently on Fedora 39, also with a Radeon RX 6700 XT.
To make things easier, so I don't forget the variable export (otherwise you get a core dump/segfault when attempting to generate an image), I wrapped it in a launch.sh script:
#!/bin/bash
export HSA_OVERRIDE_GFX_VERSION=10.3.0
bash invoke.sh
After those steps, I was able to get InvokeAI to work properly on my system using GPU acceleration.
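For anyone who wants the same workflow without a shell wrapper, here is a hypothetical Python sketch of the same idea. The gfx-target-to-override mapping below is an assumption based on the RDNA2 cards discussed in this thread, not an official ROCm support table:

```python
import os
import subprocess

# Assumed mapping: RDNA2 targets without official ROCm support are
# overridden to report gfx1030 ("10.3.0"), which is supported.
GFX_OVERRIDES = {
    "gfx1031": "10.3.0",  # e.g. RX 6700 XT (per this thread)
    "gfx1032": "10.3.0",  # e.g. RX 6600 / 6650 XT (assumption)
}

def launch_invoke(gfx_target, cmd=("bash", "invoke.sh")):
    """Launch invoke.sh with HSA_OVERRIDE_GFX_VERSION set when needed."""
    env = os.environ.copy()
    override = GFX_OVERRIDES.get(gfx_target)
    if override is not None:
        env["HSA_OVERRIDE_GFX_VERSION"] = override
    return subprocess.run(cmd, env=env)
```

The design point is the same as the launch.sh script: the override is set only for the launched process, so the rest of the session is untouched.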
@genelatham
I too get that error but the website starts for me and it works. Only had issues with controlnet, but that may be unrelated.
@russjr08 Thanks for the suggestion. I tried it and it didn't work. I think I should start over with a fresh install and try some of the patches again.
@genelatham please let us know if you are still experiencing this issue
@ebr I can confirm that with the newest installer, it correctly initializes the AMD ROCm libraries and no longer needs the pip interventions I previously had to do.
Although for my card (Radeon 6700 XT) I still have to run export HSA_OVERRIDE_GFX_VERSION=10.3.0 before launching InvokeAI: while it will still detect the card as a CUDA device, without the variable set it will crash with a "core dumped" message as soon as it actually tries to run any generation.
What is the latest version? Tomorrow is the soonest I could try it. I've been very busy with work and not had time to work on this.
@genelatham https://github.com/invoke-ai/InvokeAI/releases - v3.6.3 is the latest release as of this writing. No pressure to test this - please let us know anytime if you're still having an issue. We do believe this to be fixed.
Although for my card (Radeon 6700XT) I still have to run
export HSA_OVERRIDE_GFX_VERSION=10.3.0
before launching InvokeAI, as while it will still detect it as a cuda device, without the variable being set it'll crash upon actually trying to run any generation with a core dumped message.
This is great info @russjr08. Thank you!
@Millu: please see above - this seems like a good addition to Discord FAQs and the docs for that particular GPU.
Thanks Eugene. Unfortunately, I have hardware problems with the system. I expect to get them resolved late this week or next. Sorry.
I finally got InvokeAI to use the AMD graphics card. I am somewhat embarrassed to say the problem was that I had not added the user to the render and video groups. I'm sorry. But it is something that should be pointed out to others who have this problem, so I will leave this note here.
I have other problems now. But I don't fully understand them yet.
Ah, that happens; it's easy to forget steps. I have most everything working with my 6800 XT, so it's possible to get it to work. Sadly, I feel the performance is still significantly less than an equivalent Nvidia card.
Hello People,
Tried InvokeAI on :
Ryzen 7 5800X
32 GB RAM DDR4 3600
GPU: RX 6900 XT Aorus
Linux install is Bodhi Linux (a funny distro, based on Ubuntu 22.04)
Always device = cpu ...
Tried all I read here and everywhere else...
So I share my solution
(because the same request is 10 minutes on CPU and 6 seconds on GPU...).
I installed the AMDGPU drivers with sudo, of course...
But I didn't add my user to the groups allowed to use the kernel driver.
Type "groups" and see if you have "render" and "video".
If not, add them, reboot, and leave the device on "auto"; launching the Invoke server will show you the right GPU.
(PS: I also used the export line with the GFX version stated in this thread.)
Hope this helps.
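The group check described above can also be scripted. A Linux-only sketch, assuming render and video are the group names your distro's ROCm packaging uses (they may differ):

```python
import getpass
import grp
import pwd

def missing_gpu_groups(user=None):
    """Return the ROCm-relevant groups ('render', 'video') the user lacks."""
    user = user or getpass.getuser()
    required = {"render", "video"}
    # Supplementary groups list their members explicitly in gr_mem...
    member_of = {g.gr_name for g in grp.getgrall() if user in g.gr_mem}
    # ...but the user's primary group does not, so look it up separately.
    try:
        primary_gid = pwd.getpwnam(user).pw_gid
        member_of.add(grp.getgrgid(primary_gid).gr_name)
    except KeyError:
        pass  # user not in the passwd database; skip primary-group lookup
    return sorted(required - member_of)

if __name__ == "__main__":
    missing = missing_gpu_groups()
    if missing:
        print("add yourself to:", ", ".join(missing), "then re-login/reboot")
    else:
        print("group membership looks fine")
```

If it reports missing groups, `sudo usermod -aG render,video $USER` followed by a re-login (or reboot, as above) is the usual fix.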
Is there an existing issue for this problem?
Operating system
Linux
GPU vendor
AMD (ROCm)
GPU model
RX 6700 XT
GPU VRAM
12GB
Version number
3.6.2
Browser
Firefox 121.0
Python dependencies
No response
What happened
I installed and selected option 2 for AMD GPU. The GPU drivers are installed and other stable diffusion front ends can use it. I tried to reconfigure but no option was offered for AMD or ROCM (in option 5 of the main menu). I tried the fix from #5219. No difference. I tried the suggestions from #4202 also no change.
What you expected to happen
I expected it to use the AMD GPU
How to reproduce the problem
As stated above, I just did an install.
Additional context
I am running on Mint 21.3 which is essentially Ubuntu 22.04.
Here's what is in the log: Generate images with a browser-based interface
Discord username
@GeneL