ConfusedMerlin closed this issue 1 year ago
@ConfusedMerlin
They are on an ext4 fs that also hosts about... everything else. I know, not the best setup, but after a dozen or so not-quite-working manual partition setups I just stuffed everything onto one big fs. EDIT: wouldn't that memmap/stream be easy to reproduce?
Yeah, mmap is not an issue here. Regardless of your partitioning, ext4 is definitely good enough, and yes, it would be reproducible.
Issue Description
I tried to install vladmandic's automatic on Ubuntu yesterday to see whether the ROCm backend performs better than automatic1111's openML on Windows.
It... kind of worked. After a lot of problems with the Python 3.8/3.10 versions, it finally started. I immediately requested a 512x512 test image (happy cat sitting on a computer), but was a bit disappointed when it estimated 4 minutes to generate it. The image appeared after that time.
That is 7 times as long as the Windows openML counterpart needed. But then the CPU fan gave away that it was not the GPU doing the thinking, but the CPU. The system monitor agreed with that observation, showing pretty graphs with all my CPU cores above 50%. This was astounding and concerning at the same time.
Astounding, because the openML automatic1111 version estimated 40+ minutes for that test image with the CPU backend and pinned every core at close to 100%; your version had each core at around 60% with a lot of fluctuation. Concerning, because I realized that the GPU was idle the whole time. Looking at the system info page (thanks for including that!), I saw that the backend in use was called CPU.
I looked around the interwebs a bit; somebody here posted a similar issue some time ago (https://github.com/vladmandic/automatic/issues/816), but did not provide the required log files. There were some instructions in that ticket, though, like "remove venv, delete setup.log". Which I did.
I had a hiccup on one try, where it failed to find the CLIP thingy (this didn't happen the next time), but this did not resolve the issue. Also, there is no setup.log, as far as I can remember.
Still, the output during startup sounds kind of promising, as it says "rocm toolkit detected" and stuff like that. But even with the --use-rocm switch, it falls back to CPU without any highly visible error message.
As far as I can tell, the GPU should be ready to use; its kernel modules are compiled and activated. But this being the first time I have tried to get an AMD GPU running on Linux, I may be drawing the wrong conclusions. If you google "check if AMD GPU works on ubuntu", all the answers are about "doing lspci" and stuff, which I did after the drivers claimed to be installed. If you have a dedicated "check if it works" test at hand, I will run that one too.
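For what it's worth, one check that goes a step beyond lspci is to ask PyTorch itself which backend it sees. This is only a minimal sketch, assuming it is run with the same Python interpreter (presumably the venv) that the web UI uses; the function name `describe_torch_backend` is made up for illustration. ROCm builds of PyTorch expose the GPU through the `torch.cuda` API and set `torch.version.hip`, so a CPU-only wheel or an undetected GPU should show up here as "cpu".

```python
def describe_torch_backend():
    """Report which compute backend the installed PyTorch would use.

    Hypothetical helper for illustration; it is not part of the web UI.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter / venv at all.
        return {"backend": "torch-missing", "hip": None, "cuda": None, "devices": []}

    # ROCm builds set torch.version.hip; CUDA builds set torch.version.cuda.
    hip = getattr(torch.version, "hip", None)
    cuda = getattr(torch.version, "cuda", None)

    if not torch.cuda.is_available():
        # Either a CPU-only wheel is installed or the GPU is not visible to it.
        return {"backend": "cpu", "hip": hip, "cuda": cuda, "devices": []}

    # On ROCm builds the AMD GPU is listed through the torch.cuda namespace.
    devices = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return {"backend": "rocm" if hip else "cuda", "hip": hip, "cuda": cuda, "devices": devices}


if __name__ == "__main__":
    print(describe_torch_backend())
```

If this reports "cpu" with `hip` set to None, the installed wheel is probably not a ROCm build; if `hip` is set but the backend is still "cpu", the wheel is a ROCm build that cannot see the card.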
Finally... I am sorry, but I cannot offer logs right now. The test system being a new one, I managed to forget my gitlab pw yesterday evening, until gitlab locked the IP... Now I am at work, where I cannot access the test system (but where the pw manager knows my password). I will add the logs to this ticket later today.
Version Platform Description
Ubuntu 20.04.5 (tried 22.04 first, but the GPU driver installation failed... very hard; not your problem)
Python 3.10.12 (from that unofficial repo, with matching pip, keeping 3.8 as an alternative for Ubuntu)
Radeon RX 7600, driver 23.10.3 for Ubuntu 20.04.5 HWE (see https://www.amd.com/en/support/linux-drivers)
The Firefox that comes with Ubuntu 20.04.5 (dunno which version that is)
The vladmandic repo was cloned fresh (yesterday evening), and webui.sh seemed to have no problems getting its stuff.
Relevant log output
The changed webui.sh now contains this line (instead of only python3, which points to Python 3.8, which was declared unsupported somewhere during my first installation attempts)
EDIT: Added log and console output
Acknowledgements