Open InkyZima opened 1 year ago
Last I knew, the 5.19 kernel was not supported by ROCm. Try downgrading to the 5.17 kernel.
Actually, ROCm 5.5 was just released and supports kernel 5.19. You can try updating to that on the host system.
very interesting! thanks a lot for the info. Will try ASAP, this weekend latest, and let you know. fingers crossed (:
btw, you are missing a ' at the end of the line in the readme in "Run on the command docker build . -t 'stable-diffusion-webui-rocm"
tried; didn't work with kernel 5.17.15 and rocm 5.4.2. it keeps producing NaNs / black images only. Regarding rocm 5.5: i don't know how to get that to work; i can install rocm 5.5 from amd on my host, but there is no torch rocm5.5.
How are you installing ROCm on the host system?
I'm using the deb installation, and the ROCm packages are very picky. You can't just use mainline/ukuu to install a kernel of the 'supported' version.
With ROCm 5.4.2, I had to install the linux-oem-22.04 kernel deb package. This gives ROCm the 5.17 kernel it expects. PyTorch wants this version too.
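Since the kernel series matters this much, it may help to check what the host is actually running before installing the ROCm packages. The following is only a rough sketch: kernel_series_matches is a hypothetical helper (not part of ROCm or this repo), and 5.17 is simply the series mentioned above for ROCm 5.4.2.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a kernel release string (as printed by
# `uname -r`) belongs to the wanted major.minor series.
kernel_series_matches() {
    release="$1"   # e.g. 5.17.15-051715-generic
    wanted="$2"    # e.g. 5.17
    case "$release" in
        "$wanted"|"$wanted".*|"$wanted"-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: 5.17 is the series ROCm 5.4.2 expects, per the comment above.
if kernel_series_matches "$(uname -r)" "5.17"; then
    echo "kernel $(uname -r) is in the 5.17 series"
else
    echo "kernel $(uname -r) is not in the 5.17 series"
fi
```

This only compares version strings; it does not tell you whether the ROCm deb packages will accept the kernel.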
With ROCm 5.5, things get messier. Last I knew, PyTorch only officially supported up to 5.4.2; 5.4.3 and 5.5 haven't been added officially yet. I'm assuming ROCm 5.5 is based off the linux-image-generic-hwe-22.04 deb kernel package. I'm testing it now, though I'm not holding my breath. So we can try mixed versions. Not great, but if it works it could be helpful for people.
We really need more ROCm development / testing. It feels like ROCm is a second-class citizen to CUDA.
thanks for the info. I'll try to spend some more time testing this weekend. Though it might be wise to just wait a few weeks until pytorch+rocm5.5 is out. Related: https://github.com/vladmandic/automatic/discussions/741#discussioncomment-5809102
I just updated the rocm5.5 branch. That loads the ROCm 5.5 deb packages but still uses the SDW 5.4.2rocm build. I haven't had any issues with the mixed version so far.
You can easily build the image with the command bash build.sh rocm5.5 and deploy it with the standard docker-compose command.
See how it works for you.
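For anyone following along, the build-and-deploy step could be wrapped roughly like this. It is a sketch, not something from the repo: have_cmd and build_and_deploy are hypothetical names, and it assumes you run it from the repository root on the rocm5.5 branch, with docker-compose up as the "standard" deploy command.

```shell
#!/bin/sh
# Hypothetical helper: true when the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical wrapper around the two commands from the comment above.
build_and_deploy() {
    have_cmd docker || { echo "docker not found" >&2; return 1; }
    have_cmd docker-compose || { echo "docker-compose not found" >&2; return 1; }
    # build.sh lives in the repository root, so refuse to run elsewhere.
    [ -f build.sh ] || { echo "build.sh missing: run from the repo root" >&2; return 1; }
    # Build the rocm5.5 image, then start the container.
    bash build.sh rocm5.5 && docker-compose up
}
```

After sourcing the file, call build_and_deploy from the checkout; it stops early with a message if a prerequisite is missing.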
Hi, thanks for the effort; unfortunately no luck; same NaN error.
As a side note (I'm sure there's a way to do this, I'm just not Docker-skilled enough): when I want to change COMMANDLINE_ARGS (for example, to see if it works with --precision full --no-half), I edit docker-compose.yml (e.g. uncommenting that env variable), and that leads to a re-download of pytorch (1.5GB of data) on the next docker-compose up, which is annoying. I think this could be avoided.
Thanks again for the effort.
That is sort of how the SDW application works, and I can't really help that; it's inside the application. When you update the docker-compose file, docker redeploys the whole container, so SDW can't find the previous download and redownloads it.
The other option is to make it part of the container image, which isn't ideal.
Context: I am here from https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/5468. @hydrian i tried this repo / docker; it does not work for me. AMD RX 6800. clean Lubuntu host (5.19 kernel). I also tried --precision full, --no-half, "Upcast cross attention layer to float32". --disable-nan-check just produces black images.