[Bug]: RuntimeError: mat1 and mat2 shapes cannot be multiplied (22720x57 and 320x320)

Is there an existing issue for this?

[X] I have searched the existing issues and checked the recent builds/commits of both this extension and the webui

What happened?

When I try to train a LoRA in Dreambooth with 'Use Lora' and 'Use Lora Extended' selected, I get the following error message:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (22720x57 and 320x320)

Training terminates at 0 steps.

I don't have this problem when training without Lora.
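
A note on the shapes, in case it helps with debugging: the numbers in the error seem to line up with two of the buckets in the console log below, (456, 568) and (568, 456). The traceback shows resnet.py calling self.conv1, which ends up in lora.py calling self.linear, so it looks as if a Linear layer is being applied to a 4-D convolution input. A Linear layer treats the last dimension as its input features and flattens everything in front of it into rows. The sketch below is only shape arithmetic, not the extension's code. It assumes the usual SD 1.x latent downscale factor of 8, a batch size of 1, and that the bucket tuple is (width, height); the helper names are made up for illustration.

```python
from math import prod

VAE_DOWNSCALE = 8    # assumption: SD 1.x latents are 1/8 of the pixel size
UNET_CHANNELS = 320  # taken from the 320x320 weight in the error message


def feature_map_shape(width, height, batch=1):
    """NCHW shape of a 320-channel UNet activation for one bucket (hypothetical helper)."""
    return (batch, UNET_CHANNELS, height // VAE_DOWNSCALE, width // VAE_DOWNSCALE)


def linear_matmul_shapes(input_shape, weight_shape):
    """Shapes (mat1, mat2) that a Linear layer would try to multiply.

    The last input dimension is used as in_features and all leading
    dimensions are flattened into rows. The weight is stored as
    (out_features, in_features) and is transposed for the product.
    """
    *leading, last = input_shape
    out_features, in_features = weight_shape
    return (prod(leading), last), (in_features, out_features)


# The two buckets whose latent sizes match the numbers in the error messages.
for width, height in [(456, 568), (568, 456)]:
    mat1, mat2 = linear_matmul_shapes(feature_map_shape(width, height), (320, 320))
    compatible = mat1[1] == mat2[0]
    print(f"bucket {width}x{height}: mat1={mat1[0]}x{mat1[1]}, "
          f"mat2={mat2[0]}x{mat2[1]}, compatible={compatible}")
```

Under these assumptions the first bucket gives 22720x57 (the shape in the title) and the second gives 18240x71 (the shape in the traceback), which would explain why the reported shape changes between runs while the 320x320 weight stays the same.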

Steps to reproduce the problem

Go to the Dreambooth tab
Select 'Use Lora' and 'Use Lora Extended'
Press Train

Commit and libraries

Initializing Dreambooth
Dreambooth revision: cf086c536b141fc522ff11f6cffc8b7b12da04b9
Successfully installed accelerate-0.21.0 fastapi-0.94.1 gitpython-3.1.36 transformers-4.30.2

[+] xformers version 0.0.20 installed.
[+] torch version 2.0.1+cu118 installed.
[+] torchvision version 0.15.2+cu118 installed.
[+] accelerate version 0.21.0 installed.
[+] diffusers version 0.19.3 installed.
[+] transformers version 4.30.2 installed.
[+] bitsandbytes version 0.35.4 installed.

Command Line Arguments

set COMMANDLINE_ARGS= --xformers --autolaunch --disable-nan-check --precision full --no-half  --opt-split-attention --upcast-sampling --medvram

Console logs

venv "D:\A1111\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: v1.5.2
Commit hash: c9c8485bc1e8720aba70f029d25cba1c4abf2b5c
Installing requirements

If submitting an issue on github, please provide the full startup log for debugging purposes.

Initializing Dreambooth
Dreambooth revision: cf086c536b141fc522ff11f6cffc8b7b12da04b9
Successfully installed accelerate-0.21.0 fastapi-0.94.1 gitpython-3.1.36 transformers-4.30.2

Does your project take forever to startup?
Repetitive dependency installation may be the reason.
Automatic1111's base project sets strict requirements on outdated dependencies.
If an extension is using a newer version, the dependency is uninstalled and reinstalled twice every startup.

[+] xformers version 0.0.20 installed.
[+] torch version 2.0.1+cu118 installed.
[+] torchvision version 0.15.2+cu118 installed.
[+] accelerate version 0.21.0 installed.
[+] diffusers version 0.19.3 installed.
[+] transformers version 4.30.2 installed.
[+] bitsandbytes version 0.35.4 installed.

Launching Web UI with arguments: --xformers --autolaunch --disable-nan-check --precision full --no-half --opt-split-attention --upcast-sampling --medvram
[2023-09-16 00:18:54,157][DEBUG][git.cmd] - Popen(['git', 'version'], cwd=D:\A1111\stable-diffusion-webui, universal_newlines=False, shell=None, istream=None)
[2023-09-16 00:18:54,212][DEBUG][git.cmd] - Popen(['git', 'version'], cwd=D:\A1111\stable-diffusion-webui, universal_newlines=False, shell=None, istream=None)
2023-09-16 00:19:03,118 - ControlNet - INFO - ControlNet v1.1.313
ControlNet preprocessor location: D:\A1111\stable-diffusion-webui\extensions\sd-webui-controlnet\annotator\downloads
2023-09-16 00:19:03,933 - ControlNet - INFO - ControlNet v1.1.313
Loading weights [546263425f] from D:\A1111\stable-diffusion-webui\models\Stable-diffusion\NewKngMscl\NewKngMscl_13300.safetensors
Creating model from config: D:\A1111\stable-diffusion-webui\models\Stable-diffusion\NewKngMscl\NewKngMscl_13300.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Couldn't find VAE named vae-ft-ema-560000-ema-pruned.ckpt; using None instead
Model loaded in 60.9s (load weights from disk: 2.0s, create model: 0.4s, apply weights to model: 57.0s, calculate empty prompt: 1.4s).
[2023-09-16 00:20:06,447][DEBUG][api.py] - SD-Webui API layer loaded
Applying attention optimization: xformers... done.
[2023-09-16 00:20:07,168][DEBUG][markdown_it.rules_block.code] - entering code: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,168][DEBUG][markdown_it.rules_block.fence] - entering fence: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,169][DEBUG][markdown_it.rules_block.blockquote] - entering blockquote: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,170][DEBUG][markdown_it.rules_block.hr] - entering hr: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,170][DEBUG][markdown_it.rules_block.list] - entering list: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,170][DEBUG][markdown_it.rules_block.code] - entering code: StateBlock(line=0,level=2,tokens=2), 0, 1, False
[2023-09-16 00:20:07,170][DEBUG][markdown_it.rules_block.fence] - entering fence: StateBlock(line=0,level=2,tokens=2), 0, 1, False
[2023-09-16 00:20:07,171][DEBUG][markdown_it.rules_block.blockquote] - entering blockquote: StateBlock(line=0,level=2,tokens=2), 0, 1, False
[2023-09-16 00:20:07,171][DEBUG][markdown_it.rules_block.hr] - entering hr: StateBlock(line=0,level=2,tokens=2), 0, 1, False
[2023-09-16 00:20:07,171][DEBUG][markdown_it.rules_block.list] - entering list: StateBlock(line=0,level=2,tokens=2), 0, 1, False
[2023-09-16 00:20:07,172][DEBUG][markdown_it.rules_block.reference] - entering reference: StateBlock(line=0,level=2,tokens=2), 0, 1, False
[2023-09-16 00:20:07,172][DEBUG][markdown_it.rules_block.html_block] - entering html_block: StateBlock(line=0,level=2,tokens=2), 0, 1, False
[2023-09-16 00:20:07,172][DEBUG][markdown_it.rules_block.heading] - entering heading: StateBlock(line=0,level=2,tokens=2), 0, 1, False
[2023-09-16 00:20:07,173][DEBUG][markdown_it.rules_block.lheading] - entering lheading: StateBlock(line=0,level=2,tokens=2), 0, 1, False
[2023-09-16 00:20:07,173][DEBUG][markdown_it.rules_block.paragraph] - entering paragraph: StateBlock(line=0,level=2,tokens=2), 0, 1, False
[2023-09-16 00:20:07,281][DEBUG][markdown_it.rules_block.code] - entering code: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,281][DEBUG][markdown_it.rules_block.fence] - entering fence: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,282][DEBUG][markdown_it.rules_block.blockquote] - entering blockquote: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,282][DEBUG][markdown_it.rules_block.hr] - entering hr: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,283][DEBUG][markdown_it.rules_block.list] - entering list: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,283][DEBUG][markdown_it.rules_block.reference] - entering reference: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,283][DEBUG][markdown_it.rules_block.html_block] - entering html_block: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,284][DEBUG][markdown_it.rules_block.heading] - entering heading: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,284][DEBUG][markdown_it.rules_block.lheading] - entering lheading: StateBlock(line=0,level=0,tokens=0), 0, 1, False
[2023-09-16 00:20:07,284][DEBUG][markdown_it.rules_block.paragraph] - entering paragraph: StateBlock(line=0,level=0,tokens=0), 0, 1, False
CUDA SETUP: Loading binary D:\A1111\stable-diffusion-webui\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cudaall.dll...
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
[2023-09-16 00:20:08,558][DEBUG][api.py] - Loading Dreambooth API Endpoints.
Startup time: 225.0s (launcher: 137.5s, import torch: 8.6s, import gradio: 1.9s, setup paths: 1.7s, other imports: 2.6s, setup codeformer: 0.7s, list SD models: 0.8s, load scripts: 69.5s, create ui: 0.9s, gradio launch: 0.7s).
[2023-09-16 00:20:11,958][DEBUG][git.cmd] - Popen(['git', 'remote', 'get-url', '--all', 'origin'], cwd=D:\A1111\stable-diffusion-webui, universal_newlines=False, shell=None, istream=None)
[2023-09-16 00:20:12,031][DEBUG][git.cmd] - Popen(['git', 'cat-file', '--batch-check'], cwd=D:\A1111\stable-diffusion-webui, universal_newlines=False, shell=None, istream=<valid stream>)
[2023-09-16 00:20:12,087][DEBUG][git.cmd] - Popen(['git', 'cat-file', '--batch'], cwd=D:\A1111\stable-diffusion-webui, universal_newlines=False, shell=None, istream=<valid stream>)
[2023-09-16 00:20:12,151][DEBUG][git.cmd] - Popen(['git', 'remote', 'get-url', '--all', 'origin'], cwd=D:\A1111\stable-diffusion-webui, universal_newlines=False, shell=None, istream=None)
[2023-09-16 00:20:12,211][DEBUG][git.cmd] - Popen(['git', 'cat-file', '--batch-check'], cwd=D:\A1111\stable-diffusion-webui, universal_newlines=False, shell=None, istream=<valid stream>)
[2023-09-16 00:20:12,270][DEBUG][git.cmd] - Popen(['git', 'cat-file', '--batch'], cwd=D:\A1111\stable-diffusion-webui, universal_newlines=False, shell=None, istream=<valid stream>)
[2023-09-16 00:20:30,124][DEBUG][dreambooth.dataclasses.db_config] - Saving to D:\A1111\stable-diffusion-webui\models\dreambooth\DLAnimeL
[2023-09-16 00:20:30,390][DEBUG][dreambooth.dataclasses.db_config] - Saving to D:\A1111\stable-diffusion-webui\models\dreambooth\DLAnimeL
Initializing dreambooth training...
[2023-09-16 00:20:30,405][DEBUG][dreambooth.train_dreambooth] - Adding 'get_velocity' method to DEISMultistepScheduler...
[2023-09-16 00:20:30,405][DEBUG][dreambooth.train_dreambooth] - Adding 'get_velocity' method to UniPCMultistepScheduler...
[2023-09-16 00:20:30,539][DEBUG][dreambooth.train_dreambooth] - Pretrained path: D:\A1111\stable-diffusion-webui\models\dreambooth\DLAnimeL\working
Pre-processing images: LoraDavidLaid: : 20it [00:00, 790.60it/s]
Nothing to generate.
Enabling xformers memory efficient attention for unet
Enabling xformers memory efficient attention for unet
Found 0 reg images.
Preparing dataset...
Init dataset!
Preparing Dataset (With Caching)
Loading cached latents...
Bucket 0 (360, 720, 0) - Instance Images:  6 | Class Images: 0 | Max Examples/batch:  6
Bucket 1 (384, 680, 0) - Instance Images:  1 | Class Images: 0 | Max Examples/batch:  1
Bucket 2 (416, 624, 0) - Instance Images:  2 | Class Images: 0 | Max Examples/batch:  2
Bucket 3 (440, 584, 0) - Instance Images:  2 | Class Images: 0 | Max Examples/batch:  2
Bucket 4 (456, 568, 0) - Instance Images:  2 | Class Images: 0 | Max Examples/batch:  2
Bucket 5 (512, 512, 0) - Instance Images:  3 | Class Images: 0 | Max Examples/batch:  3
Bucket 6 (568, 456, 0) - Instance Images:  2 | Class Images: 0 | Max Examples/batch:  2
Bucket 7 (584, 440, 0) - Instance Images:  1 | Class Images: 0 | Max Examples/batch:  1
Bucket 8 (624, 416, 0) - Instance Images:  1 | Class Images: 0 | Max Examples/batch:  1
Total Buckets 9 - Instance Images: 20 | Class Images: 0 | Max Examples/batch: 20

Total images / batch: 20, total examples: 20
Total dataset length (steps): 20
Initializing bucket counter!
[2023-09-16 00:20:33,615][DEBUG][dreambooth.train_dreambooth] -   ***** Running training *****
[2023-09-16 00:20:33,615][DEBUG][dreambooth.train_dreambooth] -   Num batches each epoch = 20
[2023-09-16 00:20:33,616][DEBUG][dreambooth.train_dreambooth] -   Num Epochs = 150
[2023-09-16 00:20:33,616][DEBUG][dreambooth.train_dreambooth] -   Batch Size Per Device = 1
[2023-09-16 00:20:33,616][DEBUG][dreambooth.train_dreambooth] -   Gradient Accumulation steps = 1
[2023-09-16 00:20:33,616][DEBUG][dreambooth.train_dreambooth] -   Total train batch size (w. parallel, distributed & accumulation) = 1
[2023-09-16 00:20:33,617][DEBUG][dreambooth.train_dreambooth] -   Text Encoder Epochs: 0
[2023-09-16 00:20:33,617][DEBUG][dreambooth.train_dreambooth] -   Total optimization steps = 3000
[2023-09-16 00:20:33,618][DEBUG][dreambooth.train_dreambooth] -   Total training steps = 3000
[2023-09-16 00:20:33,618][DEBUG][dreambooth.train_dreambooth] -   Resuming from checkpoint: False
[2023-09-16 00:20:33,618][DEBUG][dreambooth.train_dreambooth] -   First resume epoch: 0
[2023-09-16 00:20:33,618][DEBUG][dreambooth.train_dreambooth] -   First resume step: 0
[2023-09-16 00:20:33,619][DEBUG][dreambooth.train_dreambooth] -   Lora: True, Optimizer: 8bit AdamW, Prec: fp16
[2023-09-16 00:20:33,619][DEBUG][dreambooth.train_dreambooth] -   Gradient Checkpointing: True
[2023-09-16 00:20:33,619][DEBUG][dreambooth.train_dreambooth] -   EMA: False
[2023-09-16 00:20:33,620][DEBUG][dreambooth.train_dreambooth] -   UNET: True
[2023-09-16 00:20:33,620][DEBUG][dreambooth.train_dreambooth] -   Freeze CLIP Normalization Layers: False
[2023-09-16 00:20:33,620][DEBUG][dreambooth.train_dreambooth] -   LR (Lora): 0.0001
[2023-09-16 00:20:33,621][DEBUG][dreambooth.train_dreambooth] -   LoRA Extended: True
[2023-09-16 00:20:33,621][DEBUG][dreambooth.train_dreambooth] -   V2: False
Steps:   0%|          | 0/3000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\A1111\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\ui_functions.py", line 729, in start_training
    result = main(class_gen_method=class_gen_method)
  File "D:\A1111\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 1552, in main
    return inner_loop()
  File "D:\A1111\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\memory.py", line 119, in decorator
    return function(batch_size, grad_size, prof, *args, **kwargs)
  File "D:\A1111\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 1286, in inner_loop
    noise_pred = unet(
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\accelerate\utils\operations.py", line 581, in forward
    return model_forward(*args, **kwargs)
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\accelerate\utils\operations.py", line 569, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\amp\autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\diffusers\models\unet_2d_condition.py", line 915, in forward
    sample, res_samples = downsample_block(
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\diffusers\models\unet_2d_blocks.py", line 977, in forward
    hidden_states = torch.utils.checkpoint.checkpoint(
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\utils\checkpoint.py", line 251, in checkpoint
    return _checkpoint_without_reentrant(
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\utils\checkpoint.py", line 432, in _checkpoint_without_reentrant
    output = function(*args, **kwargs)
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\diffusers\models\unet_2d_blocks.py", line 972, in custom_forward
    return module(*inputs)
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\diffusers\models\resnet.py", line 612, in forward
    hidden_states = self.conv1(hidden_states)
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\A1111\stable-diffusion-webui\extensions\sd_dreambooth_extension\lora_diffusion\lora.py", line 34, in forward
    self.linear(input)
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\A1111\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 361, in network_Linear_forward
    return torch.nn.Linear_forward_before_network(self, input)
  File "D:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (18240x71 and 320x320)
Steps:   0%|                                                                                  | 0/3000 [00:00<?, ?it/s]
Duration: 00:00:04
Restored system models.
Duration: 00:00:05

Additional information

none

d8ahazard / sd_dreambooth_extension