AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI

SDXL. Google Colab/Kaggle terminates the session due to running out of RAM #11836

Open rocketpal opened 1 year ago

rocketpal commented 1 year ago

Is there an existing issue for this?

What happened?

After launching the Gradio interface, Google Colab/Kaggle terminates the session due to running out of RAM, or out of VRAM if '--lowram' is used.

Steps to reproduce the problem

  1. Install requirements and repository
  2. Download model
  3. Run SD with any parameters (a minimal setup sketch follows below)
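
For anyone trying to reproduce this, a minimal Colab cell along the lines of the steps above might look like this (the checkpoint URL is a placeholder, not the exact one from my notebook):

    !git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
    %cd stable-diffusion-webui
    # put an SDXL checkpoint where the webui expects it (URL is illustrative)
    !wget -O models/Stable-diffusion/sd_xl_base_0.9.safetensors <checkpoint-url>
    # launch; the session dies once the model starts loading
    !python launch.py --share --lowvram --no-half-vae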

What should have happened?

It should just work.

Version or Commit where the problem happens

f865d3e

What Python version are you running on?

Python 3.10.x

What platforms do you use to access the UI?

Other/Cloud

What device are you running WebUI on?

Other GPUs

Cross attention optimization

Automatic

What browsers do you use to access the UI?

Google Chrome

Command Line Arguments

python launch.py --share --api --disable-safe-unpickle --enable-insecure-extension-access --opt-sdp-attention --disable-console-progressbars --no-download-sd-model --lowvram --no-half-vae --lowram --no-hashing
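
For context on the memory-related flags above (semantics as given in the webui's --help output): --lowvram splits the model into modules and keeps only one module in VRAM at a time, --lowram loads checkpoint weights into VRAM instead of system RAM, and --no-half-vae keeps the VAE in full precision. A stripped-down launch using only the memory flags would be:

    python launch.py --lowvram --lowram --no-half-vae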

List of extensions

-

Console logs

Python 3.10.12 (main, Jun  7 2023, 12:45:35) [GCC 9.4.0]
Version: v1.4.1-201-g14cf434b
Commit hash: 14cf434bc36d0ef31f31d4c6cd2bd15d7857d5c8
Installing clip
Cloning Stable Diffusion into /content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai...
Cloning into '/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai'...
remote: Enumerating objects: 574, done.
remote: Counting objects: 100% (311/311), done.
remote: Compressing objects: 100% (92/92), done.
remote: Total 574 (delta 244), reused 219 (delta 219), pack-reused 263
Receiving objects: 100% (574/574), 73.43 MiB | 21.47 MiB/s, done.
Resolving deltas: 100% (276/276), done.
Cloning Stable Diffusion XL into /content/stable-diffusion-webui/repositories/generative-models...
Cloning into '/content/stable-diffusion-webui/repositories/generative-models'...
remote: Enumerating objects: 169, done.
remote: Counting objects: 100% (65/65), done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 169 (delta 40), reused 25 (delta 25), pack-reused 104
Receiving objects: 100% (169/169), 18.21 MiB | 5.57 MiB/s, done.
Resolving deltas: 100% (66/66), done.
Cloning K-diffusion into /content/stable-diffusion-webui/repositories/k-diffusion...
Cloning into '/content/stable-diffusion-webui/repositories/k-diffusion'...
remote: Enumerating objects: 731, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 731 (delta 1), reused 3 (delta 1), pack-reused 724
Receiving objects: 100% (731/731), 143.11 KiB | 6.81 MiB/s, done.
Resolving deltas: 100% (478/478), done.
Cloning CodeFormer into /content/stable-diffusion-webui/repositories/CodeFormer...
Cloning into '/content/stable-diffusion-webui/repositories/CodeFormer'...
remote: Enumerating objects: 583, done.
remote: Counting objects: 100% (234/234), done.
remote: Compressing objects: 100% (89/89), done.
remote: Total 583 (delta 170), reused 161 (delta 145), pack-reused 349
Receiving objects: 100% (583/583), 17.30 MiB | 16.75 MiB/s, done.
Resolving deltas: 100% (281/281), done.
Cloning BLIP into /content/stable-diffusion-webui/repositories/BLIP...
Cloning into '/content/stable-diffusion-webui/repositories/BLIP'...
remote: Enumerating objects: 277, done.
remote: Counting objects: 100% (277/277), done.
remote: Compressing objects: 100% (123/123), done.
remote: Total 277 (delta 153), reused 247 (delta 151), pack-reused 0
Receiving objects: 100% (277/277), 7.03 MiB | 17.70 MiB/s, done.
Resolving deltas: 100% (153/153), done.
Installing requirements for CodeFormer
Installing requirements
Launching Web UI with arguments: --share --api --disable-safe-unpickle --enable-insecure-extension-access --opt-sdp-attention --disable-console-progressbars --no-download-sd-model --lowvram --no-half-vae --lowram --no-hashing
2023-07-17 15:32:18.705010: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-17 15:32:20.107813: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Loading weights [None] from /content/stable-diffusion-webui/models/Stable-diffusion/sd_xl_base_0.9.safetensors
preload_extensions_git_metadata for 7 extensions took 0.00s
Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://f4af1c3b65ac98ecb5.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
Startup time: 12.6s (import gradio: 0.6s, other imports: 1.7s, setup codeformer: 0.2s, load scripts: 1.6s, create ui: 1.6s, gradio launch: 6.7s, add APIs: 0.2s).
Creating model from config: /content/stable-diffusion-webui/repositories/generative-models/configs/inference/sd_xl_base.yaml
Downloading (…)olve/main/vocab.json: 100% 961k/961k [00:00<00:00, 2.96MB/s]
Downloading (…)olve/main/merges.txt: 100% 525k/525k [00:00<00:00, 57.9MB/s]
Downloading (…)cial_tokens_map.json: 100% 389/389 [00:00<00:00, 1.06MB/s]
Downloading (…)okenizer_config.json: 100% 905/905 [00:00<00:00, 3.25MB/s]
Downloading (…)lve/main/config.json: 100% 4.52k/4.52k [00:00<00:00, 18.9MB/s]

Additional information

I've tested different combinations of launch parameters. I didn't have enough RAM even with the '--lowram' parameter and a T4 x2 GPU (32 GB). I also tried different versions of the model: the official one and sd_xl_refiner_0.9.safetensors. I tried the solution from the link, but the result is still the same. You can check my Colab file via this link:

https://drive.google.com/file/d/1MXvoXpsjIUpVSKyrnUftCprvqdEFUF8l/view?usp=sharing
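
To make the RAM numbers concrete, a small cell like this sketch (using psutil and torch, both preinstalled on Colab; not part of the original notebook) can be run while the model loads:

    import psutil
    import torch

    def report_memory(tag):
        # system RAM as seen by the Colab VM
        vm = psutil.virtual_memory()
        print(f"[{tag}] RAM: {vm.used / 1e9:.1f} / {vm.total / 1e9:.1f} GB")
        # VRAM as tracked by PyTorch
        if torch.cuda.is_available():
            alloc = torch.cuda.memory_allocated() / 1e9
            reserved = torch.cuda.memory_reserved() / 1e9
            print(f"[{tag}] VRAM allocated: {alloc:.1f} GB, reserved: {reserved:.1f} GB")

    report_memory("after model load")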

bjornlarssen commented 1 year ago

On Colab I use the "High RAM" setting with T4/V100. Average RAM use for SDXL on Colab is 17-35 GB and sometimes still crashes… Also, VRAM doesn't get freed on model change, which is an absolute pain.
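
The only manual workaround I know of for VRAM not being freed is the generic PyTorch cleanup, run after dropping all references to the old model (a general pattern, not something A1111 exposes in the UI):

    import gc
    import torch

    gc.collect()               # free unreachable Python objects first
    torch.cuda.empty_cache()   # return cached blocks to the CUDA driver
    torch.cuda.ipc_collect()   # reclaim memory held by dead CUDA IPC handles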

rocketpal commented 1 year ago

I don't see that setting in my free plan, and the V100 is not available to me. I tried the fp16 version and got another error: black squares... It seems that it's impossible to run AUTOMATIC1111 on Google Colab (with the basic plan). At the same time, ComfyUI works perfectly.
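
(From what I've read, the black squares usually mean the SDXL VAE produced NaNs in half precision, so the usual mitigation is to keep --no-half-vae in the launch flags even when switching checkpoints, for example:

    python launch.py --medvram --no-half-vae --share

This is a common suggestion, not a confirmed fix for this case.)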

infinit-X commented 1 year ago

+1

Kaneki010 commented 1 year ago

+1. Even with the new --medvram-sdxl argument, the session gets terminated right at 100%.
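
(For anyone searching: the flag is spelled --medvram-sdxl and applies the --medvram optimizations only while an SDXL checkpoint is loaded, e.g.:

    python launch.py --medvram-sdxl --no-half-vae --share
)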

Nekos4Lyfe commented 1 year ago

+1. Using SDXL on Colab with A1111 is impossible due to this issue. Help is appreciated.

bjornlarssen commented 1 year ago

I switched to Vladmandic until this is fixed. It also has a memory leak, but with --medvram I can go on and on. With A1111 I used to be able to work with ONE SDXL model, as long as I kept the refiner in cache (after a while it would crash anyway). I have a weird config where both Vladmandic and A1111 are installed and I use the A1111 folders for everything, creating symbolic links for Vlad's install (example below), so it won't be very useful for anyone else, but it works for me.

(Until Vlad changes something every 2 minutes and then it stops ;) )
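
If anyone wants to copy that layout, the symlinking itself is nothing fancy; the paths below are examples, adjust them to your setup:

    # point Vlad's install at the A1111 folders
    ln -s ~/stable-diffusion-webui/models  ~/automatic/models
    ln -s ~/stable-diffusion-webui/outputs ~/automatic/outputs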

KorontosTheThird commented 1 year ago

+1

Kaneki010 commented 1 year ago

I think it's fixed, guys.

drimeF0 commented 1 year ago

+1