AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: Crash when changing checkpoints #7992

Open LonelyMoose opened 1 year ago

LonelyMoose commented 1 year ago

Is there an existing issue for this?

What happened?

webui crashes when i try to change the checkpoint

Steps to reproduce the problem

  1. run webui.sh
  2. change stable diffusion checkpoint
  3. crash

What should have happened?

it should have changed the checkpoint. i used webui on the same hardware a while ago and it worked

Commit where the problem happens

https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/0cc0ee1bcb4c24a8c9715f66cede06601bfc00c8

What platforms do you use to access the UI ?

Linux

What browsers do you use to access the UI ?

Mozilla Firefox

Command Line Arguments

PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.9,max_split_size_mb:512

List of extensions

No

Console logs

./webui.sh

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on [username] user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Python 3.10.9 (main, Dec 19 2022, 17:35:49) [GCC 12.2.0]
Commit hash: 0cc0ee1bcb4c24a8c9715f66cede06601bfc00c8
Installing requirements for Web UI
Launching Web UI with arguments: 
No module 'xformers'. Proceeding without it.
Loading weights [fe4efff1e1] from /home/[username]/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt
Creating model from config: /home/[username]/stable-diffusion-webui/configs/v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying cross attention optimization (Doggettx).
Textual inversion embeddings loaded(0): 
Model loaded in 6.2s (load weights from disk: 2.6s, create model: 0.6s, apply weights to model: 0.7s, apply half(): 0.5s, load VAE: 1.4s, move model to device: 0.4s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Loading weights [dcd690123c] from /home/[username]/stable-diffusion-webui/models/Stable-diffusion/v2-1_768-ema-pruned.safetensors
[1]    213967 killed     ./webui.sh

Additional information

No response

vladmandic commented 1 year ago

killed is typically a message that the linux kernel decided to kill a process because it went over a limit, most likely main memory (the out-of-memory killer).

have things changed in the last few months? sure, model loading is very different. and changing checkpoint causes a spike in memory until it settles back down.
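
One way to confirm that the kernel's out-of-memory killer ended the process is to look for its "Killed process" line in the kernel log. The sketch below is a minimal example, not part of the web UI: the function names are hypothetical, and it assumes a systemd machine where `journalctl -k` prints kernel messages (on other systems `dmesg` output can be passed to the same parser).

```python
import re
import subprocess

# The kernel logs OOM kills in the form
# "Out of memory: Killed process <pid> (<name>) total-vm:..."
OOM_LINE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(kernel_log: str) -> list[tuple[int, str]]:
    """Return (pid, process name) for every OOM kill found in the log text."""
    return [(int(pid), name) for pid, name in OOM_LINE.findall(kernel_log)]


def read_kernel_log() -> str:
    """Read kernel messages via journalctl (assumption: systemd is in use)."""
    result = subprocess.run(
        ["journalctl", "-k", "--no-pager"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `find_oom_kills(read_kernel_log())` right after a crash should list the Python process if memory exhaustion was the cause. An empty list points to some other reason for the kill.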

LonelyMoose commented 1 year ago

i have a laptop and nothing has changed except for the webui repo. i haven't used it constantly, so i can't pinpoint where it went wrong

horsaen commented 1 year ago

hi !! i've recently come across a solution: consider checking if a swap partition exists. i've found that if the swap partition is too small or nonexistent, the machine completely freezes.

there's a nice little chart in the 'How much swap do I need?' section here. hope it helps :))

ClashSAN commented 1 year ago

this webui needs 16 GB of RAM. create a swapfile if you have to, like so:

# Turn swap off
# This moves stuff in swap to the main memory and might take several minutes
sudo swapoff -a

# Create an empty swapfile
# bs=1G is the block size and count is the number of blocks to write.
# Together they define the file size: here 8 x 1G = 8 GiB.
sudo dd if=/dev/zero of=/swapfile bs=1G count=8

# Set the correct permissions
sudo chmod 0600 /swapfile

sudo mkswap /swapfile  # Set up a Linux swap area
sudo swapon /swapfile  # Turn the swap on

you can make the swap up to 3x the amount of your ram. but you might do better with less.
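
To see whether a machine reaches the 16 GB mentioned above once swap is counted, the totals can be read from `/proc/meminfo` (Linux only). This is a rough sketch with hypothetical helper names. The 16 GB figure is taken from the comment above, not from any official requirement.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value}.

    Memory fields are reported in kB (kibibytes); a few count fields
    such as HugePages_Total have no unit.
    """
    values: dict[str, int] = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def ram_plus_swap_gib(meminfo: dict[str, int]) -> float:
    """Total physical RAM plus total swap, in GiB."""
    total_kib = meminfo.get("MemTotal", 0) + meminfo.get("SwapTotal", 0)
    return total_kib / (1024 ** 2)


def read_proc_meminfo() -> dict[str, int]:
    """Read the live values (assumption: running on Linux)."""
    with open("/proc/meminfo") as f:
        return parse_meminfo(f.read())
```

For example, `ram_plus_swap_gib(read_proc_meminfo())` after `swapon` should grow by roughly the size of the new swapfile.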

Lerc commented 1 year ago

I have been having this issue for a while (Linux, 32 GB RAM, 12 GB VRAM). I can change models a few times, but after 3 or 4 switches it eats all the RAM and gets killed.

yggdrasil75 commented 1 year ago

I have over 90 GB of RAM. this only started happening within the last week or so. before that, I would not experience any issues switching models unless the model itself was broken. now I get crashes. I am on Windows 10, not Linux or WSL. this happens almost every time I try switching to a safetensors file, and only around half the time when switching to a ckpt, but it does happen with both. interesting to note though: safetensors files are smaller than ckpts, so it isn't RAM.

ClashSAN commented 1 year ago

well, I wouldn't know why, but you can try using the supermerger extension to unload models from RAM if that's your problem. also check whether it's an extension issue or a browser issue.

somenewaccountthen commented 1 year ago

I have had this for months. The KILLED thing. It's definitely resources, because just before it gets killed my machine almost freezes. My mouse moves at 1px per 5 seconds. Then when the mouse speeds up again I can pretty much guarantee the shell with A1111 has 'Killed' in it. I will start paying attention. My current assumption is that it got worse once I started using ControlNet. So ControlNet is probably also impacting the same resource. (Probably memory, as suggested above.) You are probably using that now sometimes?

90 GB, huh.. wtf. You'd think that would be enough. Maybe it's some artificial limit set in the OS that prevents a process from grabbing more than X memory?

With regard to the swap solution: will it then start to swap? And won't that also be very unwelcome? Performance-wise, I mean.
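
On Linux or macOS, one kind of per-process limit can be inspected with Python's standard `resource` module. The sketch below (helper names are hypothetical) reports the address-space limit, which is what `ulimit -v` sets. The `resource` module is not available on Windows, so this does not cover the Windows 10 case above.

```python
import resource


def describe_limit(value: int) -> str:
    """Format an rlimit value given in bytes."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1024 ** 3:.1f} GiB"


def address_space_limits() -> tuple[str, str]:
    """Return the (soft, hard) address-space limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return describe_limit(soft), describe_limit(hard)
```

If both values come back as "unlimited", this particular limit is not what stops the process. Other mechanisms, such as cgroup limits, would need to be checked separately.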