Open leucome opened 11 months ago
I'd look at system logs for the time you were asleep (journalctl) to figure out if e.g. something was automatically updated in the background.
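A minimal sketch of that log check, assuming an Arch-based system such as Manjaro where pacman writes to /var/log/pacman.log with ISO-8601 timestamps. The helper name pacman_changes_on and the dates are placeholders, not something from this thread.

```shell
#!/bin/sh
# List package changes that pacman logged on a given day.
# Assumes log lines that look like:
#   [2023-08-15T03:12:45+0200] [ALPM] upgraded foo (1.0-1 -> 1.1-1)
pacman_changes_on() {
    day="$1"                          # e.g. 2023-08-15
    log="${2:-/var/log/pacman.log}"   # assumed default log location
    grep -F "[${day}T" "$log" \
        | grep -F "[ALPM]" \
        | grep -E ' (installed|reinstalled|upgraded|downgraded|removed) '
}

# Example (placeholder dates; adjust to the night in question):
#   journalctl --since "2023-08-15 22:00" --until "2023-08-16 08:00"
#   pacman_changes_on 2023-08-15
```

If both come back empty for that window, a background system update is less likely, and something that A1111 or an extension installs at launch becomes the more plausible suspect.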
I could not find anything helpful in the log.
So I tried deleting the pip cache to force a re-download of the software installed by pip.
Reinstalled every single package on the system with pacman.
Then I tried a new user.
I also tried with Python 3.11.
So at least I know it is not caused by damaged files or a wrong user setting.
So next I'll format and then re-install the OS. The bug report has been up long enough that somebody would have confirmed having the same issue if it were caused by an update. Also, a1111 still works on my second computer with the same OS. The only difference is that the other computer has a 6700 XT. There is still a small chance that it is caused by an update that affects only 7000-series GPUs... I'll know for sure after re-installing the OS.
Though I still wonder what, why and how it can affect every a1111 version but none of the other Stable Diffusion UIs.
Finally, completely re-installing the system worked. So it works again, but I'll never know what was wrong.
just got the same thing when I updated today ... but a whole OS install ain't gonna happen, going to need to figure this out...
The same is happening to me. The problem started to appear right after an update of the Dreambooth extension.
I tried dev branch as well, also failing. python3: undefined symbol: cudaRuntimeGetVersion
looks like a possible bug in bitsandbytes.
This is still happening randomly. Just rebooted my machine (working fine before, no updates, no nothing) and it happened again.
Also why is it closed?
I closed it because it looked like a local issue with my system OS/config or something. But since then other people have had really similar error messages after updating. So maybe it really is an update that broke something. So I guess I'll re-open.
Same issue, Pop!_OS
I had an issue with an extension depending on bitsandbytes; removing it fixed boot for me.
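To track down which extension pulls it in, something like the sketch below might help. It assumes extensions live in the extensions directory of the webui checkout and declare dependencies in requirements*.txt files; the helper name find_bitsandbytes_extensions is made up for illustration. Extensions that install packages from an install script without listing them will not show up this way.

```shell
#!/bin/sh
# Print every requirements*.txt under an extensions directory
# that mentions bitsandbytes (case-insensitive).
find_bitsandbytes_extensions() {
    ext_dir="${1:-extensions}"   # assumed location of webui extensions
    find "$ext_dir" -name 'requirements*.txt' \
        -exec grep -il 'bitsandbytes' {} +
}

# Example, run from the stable-diffusion-webui directory:
#   find_bitsandbytes_extensions extensions
# Inside the webui's Python environment, pip can also report whether
# the package is installed and what requires it:
#   pip show bitsandbytes
```

The directory name in each printed path tells you which extension to disable or remove first.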
Alright so after closing it and firing up A1111 again it crashed with another random library error. What fixed it for good was this:
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libcudart.so
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libcudart.so.11.5.117
export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH"
Seems to be some kind of dependency issue. Get your sorbet together, ML community :/
For people trying to decipher all the previous comments who just want a quick fix: in your stable-diffusion-webui directory, run:
pip uninstall bitsandbytes
After that, running ./webui.sh should work again.
But it's possible that one of the extensions is installing bitsandbytes. For me, it was the dreambooth extension. For a better fix, you need to figure out the directory where your cuda is installed and run:
export LD_LIBRARY_PATH="/<cuda dir>:$LD_LIBRARY_PATH"
If you are on a Linux machine, it's likely somewhere in /usr/local or /usr/lib
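One way to find that directory is to search for the CUDA runtime library itself. This is a rough sketch; the helper name find_cudart_dirs is hypothetical, and on a ROCm-only system it may print nothing at all.

```shell
#!/bin/sh
# Print each directory under the given roots that contains libcudart.so*.
find_cudart_dirs() {
    find "$@" -name 'libcudart.so*' 2>/dev/null \
        | while IFS= read -r lib; do
            dirname "$lib"
        done \
        | sort -u
}

# Example, using the two locations mentioned above:
#   find_cudart_dirs /usr/local /usr/lib
# Then prepend one of the printed directories:
#   export LD_LIBRARY_PATH="/<cuda dir>:$LD_LIBRARY_PATH"
```

If more than one directory is printed, several CUDA runtimes are installed, and which one matches your PyTorch build is something to check rather than guess.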
Oh yeah, I do use the Dreambooth extension to train LoRA. So I am pretty sure that I had bitsandbytes installed.
Moved over to the dockerised versions. Seems to solve it currently.
Edit: This was most likely caused by an extension installing/updating bitsandbytes. See ruler88's comment for a short explanation and possible fix.
What happened?
So I generated an image, went to sleep, then woke up and A1111 was not working anymore. While trying to diagnose it, I noticed that none of the installed versions work anymore: version 1.4, 1.5.1, the dev branch (bd4b4292ef6c2cb0a452b7105485ec06301b7531) and the 1.4 from the developer of the Restart sampler.
Some versions start but cannot load any SD model, while others just crash at launch with undefined symbol: cudaRuntimeGetVersion
Things I already tried:
Re-install from scratch version 1.4, 1.5.1 and also the dev branch
Re-install ROCm 5.5
Re-install ROCm 5.6
Install torch for 5.5 and torch for 5.6
Reset my entire system to a one-week-old state with Timeshift
Create the venv from a local installation of Python 3.10
Create the venv from a miniconda installation of Python 3.10
Re-install the miniconda environment from scratch too, just in case
HSA_OVERRIDE_GFX_VERSION='11.0.0'
I was able to confirm that ROCm with the 7900 XT and PyTorch are definitely working. ComfyUI works fine and Vlad's webui also works fine. It only affects A1111... So it is probably something that all these A1111 versions share, most likely something that can update itself at launch, because it started by itself during the night without any manual update or reboot.
Seriously I am a bit confused.
Steps to reproduce the problem
Sleep for a couple of hours, then hope it will break itself.
What should have happened?
Ideally no self destruction.
Version or Commit where the problem happens
Multiple versions
What Python version are you running on ?
Python 3.10.x
What platforms do you use to access the UI ?
Linux
What device are you running WebUI on?
Other GPUs
Cross attention optimization
Automatic
What browsers do you use to access the UI ?
Mozilla Firefox
Command Line Arguments
List of extensions
I also tried with none.
Console logs
Additional information
My Linux distro is Manjaro, in case it is an issue specific to Manjaro. I doubt it, but who knows.