KEDI103 opened this issue 1 year ago (status: Open)
What do you mean by "other AIs"? Other UIs for Stable Diffusion, or something different, like oobabooga's text-generation-webui?
Anyway, I had issues too on my RX 5700 XT with the webui and PyTorch 2. As a workaround I kept it back for AMD cards, but then #10465 removed that. ...Which isn't completely wrong; it makes no sense to hold all cards back if the problem is only on older cards.
I made a PR for another workaround (#11048), which I hope makes everyone happy. ...But sadly it requires Python 3.10.
Meanwhile, if you tell me which AIs you were talking about, I can investigate further and try to find a proper solution.
My Radeon VII (gfx906) can work even with the dev version of InvokeAI, tested with the latest dev version of PyTorch.
Also, I noticed this warning disappeared after upgrading PyTorch for the webui:
MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx906_60.kdb Performance may degrade. Please follow instructions to install: https://github.com/ROCmSoftwarePlatform/MIOpen#installing-miopen-kernels-package
It doesn't print this anymore when you generate your first image; my guess is that the webui fails to detect the video card. InvokeAI, by contrast, directly detects my card by name:
Generate images with a browser-based interface
Initializing, be patient...
>> Initialization file /media/bcansin/1519f428-b947-449a-a54a-0aeab6646be3/home/b_cansin/InvokeAI-main/invokeai.init found. Loading...
>> Internet connectivity is True
>> InvokeAI, version 2.3.4.post1
>> InvokeAI runtime directory is "/media/bcansin/1519f428-b947-449a-a54a-0aeab6646be3/home/b_cansin/InvokeAI-main"
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type cuda
>> CUDA device 'AMD Radeon VII' (GPU 0)
I also tried another AI art generator that hasn't been updated in a long time; it also works directly with the dev version of PyTorch.
But webui refuses to work, even the stable version launched via webui.sh.
OK, but did you just get the UI running, or did you actually get it to generate an image? I got InvokeAI to run too, and the card gets recognized, but it crashes with a segmentation fault if I try to make something with my 5700 XT.
Yes, I can generate with it, even with the latest dev PyTorch. And your GPU is not in the ROCm support list, but mine is: https://rocm.docs.amd.com/en/latest/release/gpu_os_support.html Your GPU is called gfx1010, which isn't in the ROCm support list; mine is called gfx906, which is still supported. As for the segmentation fault: some models crash for me too. For example, DreamShaper V6 (baked VAE) gives me a segmentation fault with both webui and InvokeAI: https://civitai.com/models/4384/dreamshaper I can get it to run with --disable-nan-check, but then everything renders black. With the latest PyTorch it runs and uses the GPU, but the output is still all black.
I still get the same thing on 1.4.0-dev (59419bd). Edit: I also tested with InvokeAI and it produced black images there too. I thought I had installed PyTorch correctly, but I was wrong. Well, I think it's impossible to run PyTorch 2 on gfx906.
gfx906 is not the only one affected: gfx1031 (RDNA2) also suffers from this exact issue.
Running on Fedora 38 with an Intel KBL-R system.
EDIT: possibly relevant and related is #10296
AMD advertises PyTorch, Hugging Face, etc. on their live streams, but in practice I can't see it working. I posted about it on the official ROCm and PyTorch trackers, but no one from them even replied.
This is the last time I make the mistake of buying AMD; it won't happen again unless AMD fixes this mess. But instead of fixing it, they are killing ROCm support for my card in upcoming releases, so yeah, NVIDIA looks pretty good to my eyes now. I have been buying AMD since 2005, and I've had enough. NVIDIA can be expensive, but with AMD you throw away not only your money but also your time: no Windows support, battling over which amdgpu installer works for my card, fighting the installer and hoping it doesn't blow up the terminal on Linux... This shouldn't be this hard. My disappointment and regret can't be put into words.
How can I get my Radeon VII recognized as CUDA, like in the log above? I am a newbie; my issue is that it reports GPU device = cpu:
invokeai --web
[2023-08-08 13:34:56,875]::[InvokeAI]::INFO --> Patchmatch initialized
/home/suus/invokeai/.venv/lib/python3.10/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be removed in 0.17. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
[2023-08-08 13:34:58,352]::[uvicorn.error]::INFO --> Started server process [24789]
[2023-08-08 13:34:58,352]::[uvicorn.error]::INFO --> Waiting for application startup.
[2023-08-08 13:34:58,352]::[InvokeAI]::INFO --> InvokeAI version 3.0.1post3
[2023-08-08 13:34:58,353]::[InvokeAI]::INFO --> Root directory = /home/suus/invokeai
[2023-08-08 13:34:58,354]::[InvokeAI]::INFO --> GPU device = cpu
Have you come across this approach? First, make sure your GPU is being detected; something like

Name: gfx1031
Marketing Name: AMD Radeon RX 6700 XT

should be somewhere in the output.
You'll note it says gfx1031 in mine. Technically the 6700 XT isn't usable with ROCm for some reason, but in practice it is, so you run

export HSA_OVERRIDE_GFX_VERSION=10.3.0

to make the system lie about which GPU you have and, boom, it just works. We'll cover how to make this persistent further down if you want that.
Lastly, you want to add yourself to the render and video groups using

sudo usermod -a -G render

and install Python:

sudo apt-get install python3

Then you want to edit your .bashrc file to make a shortcut (called an alias) to python3 when you type python. To do this, run

nano ~/.bashrc

and add

alias python=python3
export HSA_OVERRIDE_GFX_VERSION=10.3.0

to the bottom of the file. Now your system will default to python3 instead, and the GPU lie is persistent. Neat.
I would like to follow up on this with https://github.com/pytorch/pytorch/issues/103973.
TL;DR: you need PCIe atomics support to get ROCm to work, even though it's claimed otherwise for post-Vega hardware. eGPU setups (even with integrated controllers) do not seem to expose the feature, which basically requires a full-fledged desktop with a PCIe x16 slot connected to the CPU.
It is still odd that it all used to work with PyTorch+ROCm 5.2, but AMD's documentation about atomics support has been pretty straightforward about it.
As mentioned in https://github.com/pytorch/pytorch/issues/106728, PyTorch 2 works just fine if compiled on ROCm 5.2, so I guess the problem here isn't PyTorch 1 vs 2; it's ROCm 5.3 and newer breaking support.
The PCIe atomics stuff is a good suggestion, but I don't think that's the case, at least for me; my machine should be able to handle them. Also, I tried compiling with the new ROCm 5.7 flag as described in the post you mentioned, but it didn't seem to make any difference, while PyTorch 2 compiled on ROCm 5.2 is indeed working.
I opened a new issue in ROCm's repo: https://github.com/RadeonOpenCompute/ROCm/issues/2527
Well that's just great, PyTorch deleted their rocm5.2 repo.
Edit: Oops, my bad, it's a Python 3.10-specific repo.
Hmm, it works fine with Python 3.11 and upstream PyTorch+rocm5.6 and TorchVision+rocm5.6 on gfx1031, if I specify the HSA GFX version override environment variable. It does not work with Arch's builds of PyTorch or the AUR torchvision.
export HSA_OVERRIDE_GFX_VERSION=10.3.0
Not sure if there's a compatible override for 906 / 9.0.6. Maybe ask the ROCm repository?
https://github.com/pytorch/pytorch/issues/111355#issuecomment-1800257137
Okay, here, after months of trying, is the fix for the missing PCIe atomics problem: https://github.com/pytorch/pytorch/issues/103973#issuecomment-1813214452 It's going to land in the nightly builds next week; for now that one fixes it. https://github.com/pytorch/pytorch/issues/103973#issuecomment-1816590164 Once the official build is released and works without problems, I'm going to close this issue. But it also needs this fix for AUTOMATIC1111: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/13985#issuecomment-1813885266
@KEDI103 You can use my repo to install Stable Diffusion on ROCm with RX 6000 cards; it solves the AMD ROCm RDNA2 & 3 problems another way, with Docker containers on Linux: https://github.com/hqnicolas/StableDiffusionROCm It was stable as of 1.9.3 (latest). If you like this automation repo, please leave a star on it ⭐
I am on a Radeon VII, and after 5 months the ROCm team finally fixed it, but only after tons of problems and unsupported pieces for my Radeon VII. This is my last AMD card; unless AMD stops making us suffer badly enough to regret buying AMD, I'm not buying AMD again. I have been using AMD since 2005 or so, but the Radeon VII made me give up: so many problems, support cut early, still no Windows support for PyTorch, etc.
I hope your repo helps suffering AMD users.
Is there an existing issue for this?
What happened?
Trying to make webui work with PyTorch 2.0.1 + ROCm 5.4.2, but it won't work.
Steps to reproduce the problem
What should have happened?
It should generate like normal.
Commit where the problem happens
b957dcf
What Python version are you running on ?
Python 3.10.x
What platforms do you use to access the UI ?
Linux
What device are you running WebUI on?
AMD GPUs (RX 6000 above), AMD GPUs (RX 5000 below)
What browsers do you use to access the UI ?
Mozilla Firefox
Command Line Arguments
List of extensions
No extra extensions; installed directly from GitHub.
Console logs
Additional information
I have an AMD Radeon VII (GFX9, gfx906, Vega 20) and installed ROCm 5.5. I only have the problem with webui; other AIs work perfectly on the same system I use for webui, directly with this:
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.4.2
I also read this, which is why I'm posting: https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/10465
Please fix it. I really want to use the latest PyTorch and ROCm versions; I'm stuck on this:
pip install torch==1.13.0+rocm5.2 torchvision==0.14.0+rocm5.2 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/rocm5.2
I'm begging for help at this point. Please help me. I've given days to making it work, but every time I try, I fail.