d8ahazard / sd_dreambooth_extension

[Bug]: webui.sh: line 255: 26147 Killed "${python_cmd}" -u "${LAUNCH_SCRIPT}" "$@" #1366

Closed tensorain closed 9 months ago

tensorain commented 9 months ago

Is there an existing issue for this?

What happened?

It seems that at the point of saving a checkpoint, the training process crashes and the following line is printed in the terminal: webui.sh: line 255: 26147 Killed "${python_cmd}" -u "${LAUNCH_SCRIPT}" "$@"
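For context, bash prints `Killed` when a child process is terminated by SIGKILL (signal 9). On Linux, one common sender of that signal is the kernel's out-of-memory killer, although nothing in this report confirms that this is what happened here. The sketch below is not part of the extension or of webui.sh; it only illustrates how a wrapper could report which signal ended a child process. `describe_exit` and `run_and_describe` are hypothetical helper names.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports "terminated by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_describe(cmd: list[str]) -> str:
    """Run a command and describe how it ended (hypothetical helper)."""
    return describe_exit(subprocess.run(cmd).returncode)


if __name__ == "__main__":
    # Stand-in child that sends SIGKILL to itself, to mimic an external kill.
    child = [
        sys.executable,
        "-c",
        "import os, signal; os.kill(os.getpid(), signal.SIGKILL)",
    ]
    print(run_and_describe(child))
```

If the child really was killed for memory reasons, the kernel log (for example via `dmesg`) would normally contain a matching out-of-memory entry, which is worth checking before drawing conclusions.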

Steps to reproduce the problem

  1. Install extension
  2. Run AUTOMATIC1111
  3. Attempt to DreamBooth-train SDXL with 30 images and a batch size of 1, using any configuration settings or hyperparameters, on a 4090. (If any configuration works, please provide it in a response.)
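When reproducing the steps above, it may help to record how much host RAM is free just before training starts, since GPU memory on the 4090 is only one of the resources involved. The following is a minimal sketch, assuming a Linux host that exposes `/proc/meminfo`; `parse_meminfo` and `available_gib` are hypothetical helper names and are not part of the web UI or the extension.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: numeric value}.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_gib(fields: dict[str, int]) -> float:
    """Convert the MemAvailable field (kB) to GiB; 0.0 if the field is missing."""
    return fields.get("MemAvailable", 0) / (1024 ** 2)


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        fields = parse_meminfo(meminfo.read_text())
        print(f"MemAvailable: {available_gib(fields):.1f} GiB")
        print(f"SwapFree: {fields.get('SwapFree', 0) / (1024 ** 2):.1f} GiB")
    else:
        print("/proc/meminfo not found (non-Linux host?)")
```

Running this once before starting training and once shortly before the save step would show whether host memory is close to exhausted at the moment of the crash.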

Commit and libraries

Starting at "Initializing Dreambooth" and ending several lines below at "[+] bitsandbytes version 0.35.4 installed."

Command Line Arguments

No

Console logs

bash webui.sh

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on sds user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Cannot locate TCMalloc (improves CPU memory usage)
Python 3.11.4 (main, Jul  5 2023, 14:15:25) [GCC 11.2.0]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4
Installing requirements
If submitting an issue on github, please provide the full startup log for debugging purposes.

Initializing Dreambooth
Dreambooth revision: 1a1d1621086a4725fda1200256f319c845dc7a8a
Successfully installed accelerate-0.23.0 fastapi-0.94.1 transformers-4.32.1
[!] xformers version 0.0.20 installed.
[+] torch version 2.0.1+cu118 installed.
[+] torchvision version 0.15.2+cu118 installed.
[+] accelerate version 0.23.0 installed.
[+] diffusers version 0.21.4 installed.
[+] transformers version 4.32.1 installed.
[+] bitsandbytes version 0.41.1 installed.
Launching Web UI with arguments: --cors-allow-origins=* --xformers --opt-sdp-attention --no-half-vae --listen --api
Loading weights [31e35c80fc] from /home/sds/SD/stable-diffusion-webui/models/Stable-diffusion/sd_xl_base_1.0 (copy).safetensors
/home/sds/SD/stable-diffusion-webui/extensions/sd_dreambooth_extension/scripts/main.py:301: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  with gr.Row().style(equal_height=False):
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 22.5s (prepare environment: 18.9s, import torch: 1.1s, import gradio: 0.3s, setup paths: 0.6s, other imports: 0.4s, load scripts: 0.3s, create ui: 0.7s, gradio launch: 0.1s).
Creating model from config: /home/sds/SD/stable-diffusion-webui/repositories/generative-models/configs/inference/sd_xl_base.yaml
Applying attention optimization: xformers... done.
Model loaded in 10.2s (load weights from disk: 1.1s, create model: 0.2s, apply weights to model: 7.9s, calculate empty prompt: 0.9s).
Traceback (most recent call last):
  File "/home/sds/SD/stable-diffusion-webui/venv/lib/python3.11/site-packages/gradio/routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1431, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/venv/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/venv/lib/python3.11/site-packages/gradio/utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/modules/ui_extra_networks.py", line 392, in pages_html
    return refresh()
           ^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/modules/ui_extra_networks.py", line 400, in refresh
    ui.pages_contents = [pg.create_html(ui.tabname) for pg in ui.stored_extra_pages]
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/modules/ui_extra_networks.py", line 400, in <listcomp>
    ui.pages_contents = [pg.create_html(ui.tabname) for pg in ui.stored_extra_pages]
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/modules/ui_extra_networks.py", line 162, in create_html
    self.items = {x["name"]: x for x in self.list_items()}
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/modules/ui_extra_networks.py", line 162, in <dictcomp>
    self.items = {x["name"]: x for x in self.list_items()}
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sds/SD/stable-diffusion-webui/extensions-builtin/Lora/ui_extra_networks_lora.py", line 69, in list_items
    for index, name in enumerate(networks.available_networks):
RuntimeError: dictionary changed size during iteration
Custom model name is /home/sds/Desktop/Inst_Tag_Tune/Training_Sets_9_22/For_A1111_DB_Ext/Output_Dir
Initializing dreambooth training...
Init dataset!set:   0%|          | 0/7 [00:00<?, ?it/s]
Preparing Dataset (With Caching)
Bucket 0 (456, 568, 0) - Instance Images:  1 | Class Images: 0 | Max Examples/batch:  1
Bucket 1 (584, 440, 0) - Instance Images:  3 | Class Images: 0 | Max Examples/batch:  3
Bucket 2 (624, 416, 0) - Instance Images:  4 | Class Images: 0 | Max Examples/batch:  4
Bucket 3 (680, 384, 0) - Instance Images: 11 | Class Images: 0 | Max Examples/batch: 11
Bucket 4 (720, 360, 0) - Instance Images: 11 | Class Images: 0 | Max Examples/batch: 11
Saving cache!ed latents...: 100%|██████████| 30/30 [00:12<00:00,  2.81it/s]
Total Buckets 5 - Instance Images: 30 | Class Images: 0 | Max Examples/batch: 30

Total images / batch: 30, total examples: 30
Initializing bucket counter!
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 142.13it/s]
Saving weights/samples...: : 0it [00:00, ?it/s]
webui.sh: line 255: 26147 Killed                  "${python_cmd}" -u "${LAUNCH_SCRIPT}" "$@"
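The `RuntimeError: dictionary changed size during iteration` earlier in the log is raised while the Lora extra-networks page iterates over `networks.available_networks`. It appears to be separate from the final `Killed` line, since training still started afterwards, but that is an inference from the log order rather than something the report confirms. Python raises this error whenever a dict gains or loses keys while it is being iterated. The sketch below reproduces the error and shows one common workaround, iterating over a snapshot of the keys. The dict contents and function names are made up for illustration and are not the web UI's actual code.

```python
def reproduce_error() -> str:
    """Mutate a dict while iterating over it and return the resulting error text."""
    networks = {"a": 1, "b": 2}  # stand-in for a shared registry of networks
    try:
        for index, name in enumerate(networks):
            networks[name + "_new"] = 0  # the dict grows mid-iteration
    except RuntimeError as exc:
        return str(exc)
    return "no error"


def snapshot_iteration() -> list[tuple[int, str]]:
    """Iterate over a copy of the keys so that inserts cannot break the loop."""
    networks = {"a": 1, "b": 2}
    seen = []
    for index, name in enumerate(list(networks)):  # list() takes a snapshot
        networks[name + "_new"] = 0  # safe: the loop walks the snapshot
        seen.append((index, name))
    return seen


if __name__ == "__main__":
    print(reproduce_error())
    print(snapshot_iteration())
```

In a multi-threaded UI, a snapshot only narrows the window for the race. If another thread can refresh the dict at any time, guarding both the refresh and the iteration with a lock is the more robust option.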

Additional information

No response

github-actions[bot] commented 9 months ago

This issue is stale because it has been open 5 days with no activity. Remove the stale label or comment, or this will be closed in 5 days.