bmaltais / kohya_ss

LyCORIS Dylora AttributeError apply_max_norm_regularization #2215

Closed SmirkingKitsune closed 6 months ago

SmirkingKitsune commented 6 months ago

I am having trouble running the LyCORIS Dylora adaptation algorithm on an SD v1.5 checkpoint in GUI v23.0.15. I am running this on Windows 11 with an RTX 3070. The environment uses Torch 2.1.2+cu118, CUDA 11.8, cuDNN 8905, triton 2.1.0, and Python 3.10.11.

When I try to use LyCORIS Dylora I get the following error:

2024-04-05 13:17:07|[LyCORIS]-INFO: Using rank adaptation algo: dylora
2024-04-05 13:17:07|[LyCORIS]-INFO: Use Dropout value: 0.0
2024-04-05 13:17:07|[LyCORIS]-INFO: Create LyCORIS Module
2024-04-05 13:17:08|[LyCORIS]-INFO: create LyCORIS for Text Encoder: 72 modules.
2024-04-05 13:17:08|[LyCORIS]-INFO: Create LyCORIS Module
2024-04-05 13:17:09|[LyCORIS]-INFO: create LyCORIS for U-Net: 390 modules.
2024-04-05 13:17:09|[LyCORIS]-INFO: module type table: {'DyLoraModule': 354, 'NormModule': 108}
Traceback (most recent call last):
  File "Z:\ProgramData\kohya_ss\sd-scripts\train_network.py", line 1058, in <module>
    trainer.train(args)
  File "Z:\ProgramData\kohya_ss\sd-scripts\train_network.py", line 300, in train
    network = network_module.create_network(
  File "Z:\ProgramData\kohya_ss\venv\lib\site-packages\lycoris\kohya\__init__.py", line 148, in create_network
    delattr(type(network), "apply_max_norm_regularization")
AttributeError: apply_max_norm_regularization
Traceback (most recent call last):
  File "C:\Users\sd\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\sd\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "Z:\ProgramData\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "Z:\ProgramData\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
    args.func(args)
  File "Z:\ProgramData\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "Z:\ProgramData\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['Z:\ProgramData\kohya_ss\venv\Scripts\python.exe', 'Z:\ProgramData\kohya_ss/sd-scripts/train_network.py', '--max_grad_norm=0', '--bucket_reso_steps=1', 
'--cache_latents', '--cache_latents_to_disk', '--caption_extension=.txt', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--flip_aug', '--learning_rate=4e-07', '--logging_dir=Z:\Stable_Diffusion_Training\SD_Scripts\Dev\log', '--lr_scheduler=constant_with_warmup', '--lr_scheduler_num_cycles=5', '--lr_warmup_steps=13012', '--max_data_loader_n_workers=0', '--resolution=512,512', '--max_token_length=225', '--max_train_steps=162650', '--min_snr_gamma=10', '--mixed_precision=bf16', '--network_alpha=64', '--network_args', 'preset=full', 'conv_dim=64', 'conv_alpha=64', 'use_tucker=False', 'block_size=1', 'rank_dropout=0', 'module_dropout=0', 'algo=dylora', 'train_norm=True', '--network_dim=64', '--network_module=lycoris.kohya', '--network_train_unet_only', '--noise_offset=0.0357', '--optimizer_args', 'scale_parameter=False', 'relative_step=False', 'warmup_init=False', '--optimizer_type=Adafactor', '--output_dir=Z:\Stable_Diffusion_Training\SD_Scripts\Dev\model', '--output_name=DevV3.1', '--pretrained_model_name_or_path=Z:/ProgramData/Stable-Diffusion-webui/webui/models/Stable-diffusion/CustomMix_v3.safetensors', '--save_every_n_epochs=1', '--save_model_as=safetensors', '--save_precision=fp16', '--seed=1234', '--shuffle_caption', '--train_batch_size=1', '--training_comment=V3.1', '--train_data_dir=Z:\Stable_Diffusion_Training\SD_Scripts\Dev\img', '--unet_lr=4e-07', '--xformers', '--sample_sampler=euler_a', '--sample_prompts=Z:\Stable_Diffusion_Training\SD_Scripts\Dev\model\sample\prompt.txt', '--sample_every_n_epochs=1']' returned non-zero exit status 1.

I have tried reinstalling the venv but got the same result. For both attempts, I have used bitsandbytes 0.43.0 or bitsandbytes-windows.

I have tried adding an if statement to the lycoris_lora 2.2.0.post3 module so that it only calls delattr(type(network), "apply_max_norm_regularization") when the attribute exists:

if hasattr(type(network), "apply_max_norm_regularization"):
    delattr(type(network), "apply_max_norm_regularization")

However, I still got the same error.
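Note: one possible reason a hasattr guard does not prevent this error is that hasattr also finds attributes inherited from a parent class, while delattr on a class only removes attributes defined directly on that class. The sketch below illustrates this general Python behavior. The class names are hypothetical, and the thread does not confirm that the LyCORIS network class inherits the method in this way.

```python
class BaseNetwork:
    # Hypothetical parent class that defines the method.
    def apply_max_norm_regularization(self):
        return "base implementation"


class DerivedNetwork(BaseNetwork):
    # Hypothetical subclass: it inherits the method and does not redefine it.
    pass


network = DerivedNetwork()
cls = type(network)

# hasattr walks the inheritance chain, so the guard passes.
guard_passes = hasattr(cls, "apply_max_norm_regularization")

# The attribute is not in the subclass's own namespace.
defined_on_subclass = "apply_max_norm_regularization" in vars(cls)

# delattr only looks at the class itself, so it raises despite the guard.
try:
    delattr(cls, "apply_max_norm_regularization")
    raised = False
except AttributeError:
    raised = True

print(guard_passes, defined_on_subclass, raised)
```

If this is the cause, checking `"apply_max_norm_regularization" in vars(type(network))` would be a guard that matches what delattr can actually remove.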

I have also tried running SD-Scripts v0.8.4 on GUI v23.0.15 by disabling git and manually replacing SD-Scripts v0.8.5 with a copy of SD-Scripts v0.8.4; this resulted in the same error. I was previously using SD-Scripts v0.8.4 on GUI v22.6.2 and that worked fine, so I don't think it is an SD-Scripts problem.

I did not experience this issue when using Kohya Dylora, only LyCORIS Dylora.

bmaltais commented 6 months ago

OK, I have been able to reproduce the issue, and it appears to be a bug introduced in a version of lycoris_lora newer than the one used in GUI v22.6.2. The latest GUI uses the latest version of lycoris_lora, and this is where the traceback is coming from. Perhaps @KohakuBlueleaf can do something about it. I am not sure I can fix this with a GUI parameter, but maybe I can. I will do a quick code check for this parameter name and see if it can be passed to the network, but I doubt it.

You may want to open an issue directly on @KohakuBlueleaf's LyCORIS repo: https://github.com/KohakuBlueleaf/LyCORIS/issues

bmaltais commented 6 months ago

To work around the compatibility issue with the dylora algorithm, which currently lacks support for scale weight normalization, you can adjust the module's initialization code as follows:

if algo == "dylora":
    # Attempt to remove the unsupported feature from the `network` class
    try:
        delattr(type(network), "apply_max_norm_regularization")
    except AttributeError:
        # If the attribute doesn't exist, do nothing
        pass

This adjustment lets network creation proceed when using the dylora algorithm: the unsupported apply_max_norm_regularization feature is removed if it can be, and the AttributeError is ignored if it cannot.
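As a minimal sketch of what this workaround does in the failing case, the example below uses contextlib.suppress, which behaves the same as the try/except above. The class names are hypothetical stand-ins, and the assumption that the method is inherited from a parent class is not confirmed in this thread. Under that assumption, the error is silenced but the method stays reachable through the parent class.

```python
import contextlib


class BaseNetwork:
    # Hypothetical parent class providing the method.
    def apply_max_norm_regularization(self):
        return "max-norm applied"


class DyLoraNetwork(BaseNetwork):
    """Hypothetical subclass that inherits the method rather than defining it."""


network = DyLoraNetwork()

# Equivalent to try/except AttributeError: pass
with contextlib.suppress(AttributeError):
    delattr(type(network), "apply_max_norm_regularization")

# Network creation can continue, but nothing was actually removed here:
still_available = hasattr(network, "apply_max_norm_regularization")
print(still_available)
```

Whether leaving the method in place matters depends on whether the training script ends up calling it, which this sketch does not address.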

If you're currently using an older version of the GUI and encountering this issue, you have two options:

  1. Continue using the older GUI until @KohakuBlueleaf addresses this compatibility issue in his module.
  2. Upgrade to the latest GUI and LyCORIS code, applying the code modification above to the module's initialization file (__init__.py). Be aware that this is a temporary solution and may cause complications when updating the LyCORIS module in the future. Once an updated version of the module that resolves this issue is released, you'll need to revert the changes made to the initialization file to ensure a smooth installation.

Caution: Opting for the temporary fix to use the latest GUI features alongside the Lycoris code necessitates vigilance regarding future module updates. It's important to revert the manual adjustments to the initialization file before installing any updates to avoid potential installation issues.

SmirkingKitsune commented 6 months ago

Okay, I now understand that I made a mistake when I was troubleshooting the lycoris_lora module. Your workaround for lycoris_lora seems to work. I have also tested GUI v23.0.15 with lycoris_lora 2.0.2 (the version GUI v22.6.2 uses) instead of lycoris_lora 2.2.0.post3, and this also seemed to work.
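When comparing setups like this, it can help to confirm which lycoris_lora version the venv actually has installed. The snippet below is a small sketch using the standard library's importlib.metadata. The helper name installed_version is made up for illustration, and the snippet assumes it is run with the venv's own Python interpreter.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version found in the current environment, or None if not installed.
print("lycoris_lora:", installed_version("lycoris_lora"))
```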

Therefore, it is clear that the issue is not with GUI v23.0.15. I will open an issue on the LyCORIS repo and close this one.