[Bug]: Please help. My Textual Inversion Trainings are unbearably slow. Taking days instead of minutes. They used to be fine. What did I do wrong?

DEC1MU5 commented 10 months ago

Is there an existing issue for this?

[X] I have searched the existing issues and checked the recent builds/commits

What happened?

I created a new Automatic1111 install tonight with the latest version with no extensions or anything else I can think of to mess up my trainings. Updated Torch, Xformers, and PIP.

It's telling me things like it will take DAYS to train where it used to take 40 minutes to 4 hours.

My Specs are Win 10, AMD Ryzen 5900, Nvidia 3060ti 8GB, 64 GB System Ram.

My command line args are:

set COMMANDLINE_ARGS= --xformers --disable-nan-check --medvram

The TI I created is FILENAMEphoto of a woman with brunette hair

Number of vectors per token 8

any my training parameters are

New TI Created of a woman "FILENAME"embedding learning rate: 0.05:10, 0.02:20, 0.01:60, 0.005:200, 0.002:500, 0.001:3000, 0.0005hypernetwork learning rate 0.00001

gradient clipping disabled

batch size 8 gradient accumulation steps 3 (My processed dataset dir has 24 512x512 images 8x3=23)log directory "textual_inversion"prompt template is "Custom_Subject_filewords"

The template I use contains only this:a photo of a [name], [filewords]

width/height 512SD model loaded is V-15-pruned

do not resizes images unticked

max steps 3000

save image and save a embedding copy both set to 50

Use PNG alpha channel as loss weight UNTICKED

Save images with embedding in PNG chunks TICKED

Read parameters (prompt, etc...) from txt2img tab when making previews UNTICKED

Shuffle tags by ',' when creating prompts. TICKED

Drop out tags when creating prompts. 0.1

Choose latent sampling method (Deterministic)

IT takes like 10 minutes to do one step and the timer just gets longer and longer until it gets into the hundreds of hours.

looking like this.

Preparing dataset...

100%|██████████████████████████████████████████████████████████████████████████████████| 48/48 [00:02<00:00, 17.51it/s]

Training textual inversion [Epoch 1: 1/1] loss: 0.0945899: 0%| | 1/3000 [01:45<87:48:03, 105.40s/it]

It's driving me mad, what happened? It used to be fine.

if you need any more info I'll do my best to help you help me.

Thanks.

Steps to reproduce the problem

Go to .... Training 2set COMMANDLINE_ARGS= --xformers --disable-nan-check --medvram

The TI I created is FILENAMEphoto of a woman with brunette hair

Number of vectors per token 8

any my training parameters are

New TI Created of a woman "FILENAME"embedding learning rate: 0.05:10, 0.02:20, 0.01:60, 0.005:200, 0.002:500, 0.001:3000, 0.0005hypernetwork learning rate 0.00001

gradient clipping disabled

batch size 8 gradient accumulation steps 3 (My processed dataset dir has 24 512x512 images 8x3=23)log directory "textual_inversion"prompt template is "Custom_Subject_filewords"

The template I use contains only this:a photo of a [name], [filewords]

width/height 512SD model loaded is V-15-pruned

do not resizes images unticked

max steps 3000

save image and save a embedding copy both set to 50

Use PNG alpha channel as loss weight UNTICKED

Save images with embedding in PNG chunks TICKED

Read parameters (prompt, etc...) from txt2img tab when making previews UNTICKED

Shuffle tags by ',' when creating prompts. TICKED

Drop out tags when creating prompts. 0.1

Choose latent sampling method (Deterministic)

... training time just goes up and up

What should have happened?

training usually takes me 40 minutes to 8 hours depending on my dataset. Its now taking DAYS. about 10 minutes per step.

Version or Commit where the problem happens

version: v1.5.1 • python: 3.10.6 • torch: 2.0.1+cu118 • xformers: 0.0.20 • gradio: 3.32.0 • checkpoint: e1441589a6

What Python version are you running on ?

Python 3.10.x

What platforms do you use to access the UI ?

Windows

What device are you running WebUI on?

Nvidia GPUs (RTX 20 above)

Cross attention optimization

xformers

What browsers do you use to access the UI ?

Google Chrome

Command Line Arguments

COMMANDLINE_ARGS= --xformers --disable-nan-check --medvram

List of extensions

none. New install just for TI training.

Console logs

venv "C:\AI\stable-diffusion-webui 8-20\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: v1.5.1
Commit hash: 68f336bd994bed5442ad95bad6b6ad5564a5409a
Launching Web UI with arguments: --xformers --disable-nan-check --medvram
Loading weights [e1441589a6] from C:\AI\stable-diffusion-webui 8-20\models\Stable-diffusion\v1-5-pruned.ckpt
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 8.0s (launcher: 1.8s, import torch: 2.5s, import gradio: 0.7s, setup paths: 0.6s, other imports: 0.6s, list SD models: 0.4s, load scripts: 1.0s, create ui: 0.3s, gradio launch: 0.1s).
Creating model from config: C:\AI\stable-diffusion-webui 8-20\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying attention optimization: xformers... done.
Model loaded in 5.2s (load weights from disk: 3.0s, create model: 0.6s, apply weights to model: 0.7s, apply half(): 0.7s, calculate empty prompt: 0.2s).
Traceback (most recent call last):
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\gradio\routes.py", line 422, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\gradio\blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\gradio\blocks.py", line 1051, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\AI\stable-diffusion-webui 8-20\modules\textual_inversion\ui.py", line 11, in create_embedding
    filename = modules.textual_inversion.textual_inversion.create_embedding(name, nvpt, overwrite_old, init_text=initialization_text)
  File "C:\AI\stable-diffusion-webui 8-20\modules\textual_inversion\textual_inversion.py", line 298, in create_embedding
    assert not os.path.exists(fn), f"file {fn} already exists"
AssertionError: file C:\AI\stable-diffusion-webui 8-20\embeddings\SUBJECT578-V8-2.pt already exists
Traceback (most recent call last):
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\gradio\routes.py", line 422, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\gradio\blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\gradio\blocks.py", line 1051, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\AI\stable-diffusion-webui 8-20\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\AI\stable-diffusion-webui 8-20\modules\textual_inversion\ui.py", line 11, in create_embedding
    filename = modules.textual_inversion.textual_inversion.create_embedding(name, nvpt, overwrite_old, init_text=initialization_text)
  File "C:\AI\stable-diffusion-webui 8-20\modules\textual_inversion\textual_inversion.py", line 298, in create_embedding
    assert not os.path.exists(fn), f"file {fn} already exists"
AssertionError: file C:\AI\stable-diffusion-webui 8-20\embeddings\SUBJECT578-V8-2.pt already exists
Calculating sha256 for C:\AI\stable-diffusion-webui 8-20\embeddings\SUBJECT578-V8-3.pt: a6d24eade85e7d894ff3b8139d1c5ac3689bd2c3f08adccbff310b8c956b3898
Training at rate of 0.05 until step 10
Preparing dataset...
100%|██████████████████████████████████████████████████████████████████████████████████| 48/48 [00:02<00:00, 16.37it/s]
Training textual inversion [Epoch 2: 1/1] loss: 0.1649854:   0%|                  | 2/3000 [03:39<92:11:24, 110.70s/it]

Additional information

It just takes unbelievably long, when it used to take a normal amount of time for my Specs.

4eJIoBek1 commented 10 months ago

Lol seems like you forgot to remove --medvram argument

silverhammer751 commented 8 months ago

I've found that I sometimes have to restart my PC to get textual inversions to train faster. After a while (especially if I'm doing several in a row) it starts to slow down and needs to be reset.

engineer1978mlo commented 2 months ago

Got same problem here, did you manage to find out what it was?

reowasnotavailable commented 1 month ago

Same problem, same settings. 4070 Ti, 5 vectors, 15 pics, Batchsize 8 and Gradient step 2.

3h ~8%, that can't be right...

[edit] I went through every tutorial step again and recognized that in Settings -> Training the following was not checked:

Move VAE and CLIP to RAM when training if possible. Saves VRAM.
Use cross attention optimizations while training

After checking and saving this time it was applied and my ETA went down to 35-40mins.

AUTOMATIC1111 / stable-diffusion-webui