[Bug]: LyCORIS BOFT networks don't work with LyCORIS 2.1.0.post2

Exist-c commented 8 months ago

Checklist

[ ] The issue exists after disabling all extensions
[ ] The issue exists on a clean installation of webui
[ ] The issue is caused by an extension, but I believe it is caused by a bug in the webui
[X] The issue exists in the current version of the webui
[X] The issue has not been reported before recently
[ ] The issue has been reported before but has not been fixed yet

What happened?

I trained a LyCORIS BOFT LoRA using kohya_ss/sd-script with LyCORIS version 2.1.0.post2. When I use it for inference in the webui, the console prints this error:

Traceback (most recent call last):
  File "D:\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 280, in load_networks
    net = load_network(name, network_on_disk)
  File "D:\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 219, in load_network
    net_module = nettype.create_module(net, weights)
  File "D:\stable-diffusion-webui\extensions-builtin\Lora\network_oft.py", line 10, in create_module
    return NetworkModuleOFT(net, weights)
  File "D:\stable-diffusion-webui\extensions-builtin\Lora\network_oft.py", line 32, in __init__
    self.alpha = weights.w["alpha"] # alpha is constraint
KeyError: 'alpha'

After checking network_oft.py, I suspect that LyCORIS has been updated in a way that leaves the webui unable to distinguish between kohya OFT and LyCORIS OFT networks. The relevant code is:

        # kohya-ss
        if "oft_blocks" in weights.w.keys():
            self.is_kohya = True
            self.oft_blocks = weights.w["oft_blocks"] # (num_blocks, block_size, block_size)
            self.alpha = weights.w["alpha"] # alpha is constraint
            self.dim = self.oft_blocks.shape[0] # lora dim
        # LyCORIS OFT
        elif "oft_diag" in weights.w.keys():
            self.oft_blocks = weights.w["oft_diag"]
            # self.alpha is unused
            self.dim = self.oft_blocks.shape[1] # (num_blocks, block_size, block_size)

            # LyCORIS BOFT
            if weights.w["oft_diag"].dim() == 4:
                self.is_boft = True

To check this, I trained a new LyCORIS BOFT LoRA with LyCORIS 2.1.0.dev9 using the same parameters, and it worked normally.
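
The suspected misclassification can be illustrated in isolation. This is only a minimal sketch: it uses plain dicts in place of weights.w, and pick_branch and the three example dicts are hypothetical names, not part of the webui. It assumes, as the traceback suggests, that a 2.1.0.post2 file contains "oft_blocks" but no "alpha".

```python
def pick_branch(w):
    """Mirror the key checks quoted above; w is a plain dict standing in for weights.w."""
    if "oft_blocks" in w:
        return "kohya"    # this branch goes on to read w["alpha"]
    if "oft_diag" in w:
        return "lycoris"  # this branch never reads "alpha"
    return "unknown"


# Hypothetical key sets; real files hold tensors, not placeholder objects.
kohya_file = {"oft_blocks": object(), "alpha": 1.0}
old_lycoris_file = {"oft_diag": object()}
new_lycoris_file = {"oft_blocks": object()}  # assumption: no "alpha" key saved

for name, w in [("kohya", kohya_file),
                ("old lycoris", old_lycoris_file),
                ("new lycoris", new_lycoris_file)]:
    branch = pick_branch(w)
    try:
        if branch == "kohya":
            w["alpha"]  # same lookup as network_oft.py line 32
        print(name, "->", branch, "branch, loads")
    except KeyError as err:
        print(name, "->", branch, "branch, fails with KeyError:", err)
```

Under these assumptions, only the newer LyCORIS file is routed into the kohya branch and then fails on the "alpha" lookup.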

Steps to reproduce the problem

Train a LyCORIS BOFT LoRA with kohya_ss/sd-script using the following parameters:

accelerate launch --num_cpu_threads_per_process=2 "./train_network.py" --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --pretrained_model_name_or_path="D:\stable-diffusion-webui\models\Stable-diffusion\animefull-final-pruned-fp16.safetensors" --train_data_dir="E:/artist/test" --resolution="720,720" --output_dir="D:/stable-diffusion-webui/models/lora" --logging_dir="D:/stable-diffusion-webui/models/lora/log" --save_model_as=safetensors --network_module="lycoris.kohya" --network_args "preset=attn-mlp" "algo=boft" --network_dim="32" --output_name="boft-test-2" --lr_scheduler="REX" --train_batch_size="1" --mixed_precision="bf16" --save_precision="bf16" --seed="114514" --caption_extension=".txt" --cache_latents --optimizer_type="Prodigy" --max_grad_norm="1" --max_train_epochs="6" --max_data_loader_n_workers="2" --max_token_length=225 --clip_skip=2 --bucket_reso_steps=2 --xformers --persistent_data_loader_workers --bucket_no_upscale --noise_offset=0.0 --tokenizer_cache_dir D:\clip-vit-L --vae="D:/stable-diffusion-webui/models/VAE/animal.pt" --save_every_n_epochs="1" --learning_rate="1" --vae_batch_size="8" --gradient_checkpoint

Generate an image with the resulting LoRA.

What should have happened?

The network should load and work like a normal LoRA.

What browsers do you use to access the UI ?

Microsoft Edge

Sysinfo

sysinfo-2024-02-20-04-52.json

Console logs

(D:\Env\cuda12.1) D:\stable-diffusion-webui>python webui.py --xformers
Tag Autocomplete: Could not locate model-keyword extension, Lora trigger word completion will be limited to those added through the extra networks menu.
[-] ADetailer initialized. version: 24.1.2, num models: 20
ControlNet preprocessor location: D:\stable-diffusion-webui\extensions\sd-webui-controlnet\annotator\downloads
2024-02-20 11:01:33,065 - ControlNet - INFO - ControlNet v1.1.440
2024-02-20 11:01:33,479 - ControlNet - INFO - ControlNet v1.1.440
[sd-webui-freeu] Controlnet support: *enabled*
== WD14 tagger /gpu:0, uname_result(system='Windows', node='DESKTOP-H8FC692', release='10', version='10.0.19045', machine='AMD64') ==
Loading weights [319bd9f4fa] from D:\stable-diffusion-webui\models\Stable-diffusion\animefull-final-pruned-fp16.safetensors
Creating model from config: D:\stable-diffusion-webui\configs\v1-inference.yaml
2024-02-20 11:01:36,254 - ControlNet - INFO - ControlNet UI callback registered.
Loading VAE weights specified in settings: D:\stable-diffusion-webui\models\VAE\animal.pt
D:\stable-diffusion-webui\extensions\sd-webui-check-tensors\scripts\check-tensors.py:21: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  with gr.Row().style(equal_height=False):
Applying attention optimization: xformers... done.
Running on local URL:  http://127.0.0.1:7860
Model loaded in 5.0s (load weights from disk: 0.4s, create model: 1.6s, apply weights to model: 1.4s, load VAE: 0.8s, load textual inversion embeddings: 0.5s, calculate empty prompt: 0.2s).

To create a public link, set `share=True` in `launch()`.
Startup time: 31.3s (import torch: 11.6s, import gradio: 1.2s, setup paths: 0.9s, initialize shared: 0.2s, other imports: 1.1s, list SD models: 0.2s, load scripts: 8.6s, create ui: 4.5s, gradio launch: 3.0s).
loading network D:\stable-diffusion-webui\models\lora\boft-test-2-000002.safetensors: KeyError
Traceback (most recent call last):
  File "D:\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 280, in load_networks
    net = load_network(name, network_on_disk)
  File "D:\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 219, in load_network
    net_module = nettype.create_module(net, weights)
  File "D:\stable-diffusion-webui\extensions-builtin\Lora\network_oft.py", line 10, in create_module
    return NetworkModuleOFT(net, weights)
  File "D:\stable-diffusion-webui\extensions-builtin\Lora\network_oft.py", line 32, in __init__
    self.alpha = weights.w["alpha"] # alpha is constraint
KeyError: 'alpha'

100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00,  5.18it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  4.98it/s]
Interrupted with signal 2 in <frame at 0x00000264789DF750, file 'D:\\Env\\cuda12.1\\lib\\threading.py', line 324, code wait>

Additional information

No response

w-e-w commented 8 months ago

@KohakuBlueleaf

KohakuBlueleaf commented 8 months ago

New LyCORIS saves the OFT blocks by default so that users can resume training. I will update the comment and use weights.w.get.
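
For readers following along, the difference between indexing and weights.w.get can be sketched as below. This is not the actual patch: read_oft_weights is a hypothetical helper, and a plain dict stands in for weights.w. It only shows that dict.get returns None for a missing "alpha" instead of raising KeyError, so the caller can decide how to treat a file without a constraint.

```python
def read_oft_weights(w):
    """Hypothetical helper: collect OFT weights without assuming "alpha" exists."""
    if "oft_blocks" in w:
        key = "oft_blocks"
    elif "oft_diag" in w:
        key = "oft_diag"
    else:
        raise ValueError("no OFT weights found")
    return {
        "source_key": key,
        "blocks": w[key],
        "alpha": w.get("alpha"),  # None when the key is absent, no KeyError
    }


with_alpha = read_oft_weights({"oft_blocks": [0.0], "alpha": 0.5})
without_alpha = read_oft_weights({"oft_blocks": [0.0]})
print(with_alpha["alpha"], without_alpha["alpha"])
```

How a None alpha should then be interpreted is left open here, since the thread does not spell that out.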

KohakuBlueleaf commented 8 months ago

@Exist-c Both the built-in LoRA system and LyCORIS need some modification for this case. Once I push the fix, you may need to rerun your training or manually write the alpha value (the constraint) into your state dict.
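
Manually writing the alpha value could look roughly like the sketch below. It rests on several assumptions that the thread does not confirm: that keys follow a "<module name>.oft_blocks" and "<module name>.alpha" pattern, that the same constraint value applies to every module, and that the correct value is known from the training run. add_missing_alpha and the example keys are hypothetical. With a real .safetensors file, the dict would hold tensors and would come from a loader such as safetensors.torch.load_file.

```python
def add_missing_alpha(state_dict, alpha_value):
    """Return a copy with '<prefix>.alpha' added wherever '<prefix>.oft_blocks' lacks one.

    Assumption: alpha_value is the constraint used during training.
    """
    patched = dict(state_dict)
    suffix = ".oft_blocks"
    for key in state_dict:
        if key.endswith(suffix):
            prefix = key[: -len(suffix)]
            # setdefault keeps any alpha that is already present
            patched.setdefault(prefix + ".alpha", alpha_value)
    return patched


# Hypothetical state dict; the values are placeholders for tensors.
example = {
    "lora_unet_block_a.oft_blocks": [0.0],
    "lora_unet_block_b.oft_blocks": [0.0],
    "lora_unet_block_b.alpha": 2.0,
}
patched = add_missing_alpha(example, alpha_value=1.0)
print(sorted(patched))
```

The function returns a new dict rather than modifying its input, so the original weights stay untouched until the patched copy has been checked and saved.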

Exist-c commented 8 months ago

Thank you, now it works

AUTOMATIC1111 / stable-diffusion-webui