kohya-ss / sd-scripts

Merge of Flux lycoris.kohya LoRAs fails with KeyError: 'lora_te1_text_model_encoder_layers_0_mlp_fc1.hada_w1_' #1753

Closed: envy-ai closed this issue 2 weeks ago

envy-ai commented 3 weeks ago

I'm running the latest pull of the sd3_5_support branch as of this issue report.

Both LoRAs are tested and working. Here's the command line I used to train them (the two commands were essentially identical except for the LoRA name and the training data directory):

accelerate launch --mixed_precision bf16 --num_cpu_threads_per_process 1 sd-scripts/flux_train_network.py --pretrained_model_name_or_path "C:\AI\ComfyUI\models\checkpoints\FLUX1\colossusProjectFlux_21DeDistilledFP8UNET.safetensors" --clip_l "C:\AI\ComfyUI\models\clip\colossusProjectFlux_clipLV21.safetensors" --t5xxl "C:\AI\ComfyUI\models\clip\t5\google_t5-v1_1-xxl_encoderonly-fp8_e4m3fn.safetensors" --ae "C:/AI/ComfyUI/models/flux/ae.sft" --cache_latents_to_disk --save_model_as safetensors --xformers --persistent_data_loader_workers --max_data_loader_n_workers 2 --seed 42 --gradient_checkpointing --mixed_precision bf16 --save_precision bf16 --resolution 512,512 --network_module lycoris.kohya --network_dim 8 --optimizer_type Prodigy --learning_rate 1 --cache_text_encoder_outputs --cache_text_encoder_outputs_to_disk --fp8_base --highvram --max_train_epochs 28 --save_every_n_epochs 1 --sample_every_n_epochs 1 --sample_prompts "C:/AI/automatic/models/Lora/anime2.txt" --sample_sampler "euler" --output_dir D:\ai\models\lora --output_name anistyle1 --timestep_sampling sigmoid --model_prediction_type raw --guidance_scale 1.0 --loss_type l2 --train_data_dir C:\AI\training_data\anistyle1_flux\ --train_batch_size 7 --network_alpha 32 --caption_extension txt --network_args "algo=loha" "preset=full" "factor=8" "decompose_both=True" "full_matrix=True"

Here's the full error message:

(kohya_flux) PS C:\AI\kohya_flux\sd-scripts> python networks\flux_merge_lora.py --save_to D:\ai\models\lora\EnvyFluxAnime02.safetensors --models D:\ai\models\lora\anistyle2-000003.safetensors D:\ai\models\lora\anistyle1-000002.safetensors --ratios 1 1
C:\Users\aaa\anaconda3\envs\kohya_flux\lib\site-packages\diffusers\utils\outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
C:\Users\aaa\anaconda3\envs\kohya_flux\lib\site-packages\diffusers\utils\outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
C:\Users\aaa\anaconda3\envs\kohya_flux\lib\site-packages\diffusers\utils\outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
2024-11-02 20:33:14.186900: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-11-02 20:33:14.889338: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
C:\Users\aaa\anaconda3\envs\kohya_flux\lib\site-packages\diffusers\utils\outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
2024-11-02 20:33:16 INFO     loading: D:\ai\models\lora\anistyle2-000003.safetensors              flux_merge_lora.py:453
                    INFO     dim: [], alpha: [32.0]                                               flux_merge_lora.py:484
                    INFO     merging...                                                           flux_merge_lora.py:487
  0%|                                                                                         | 1/1640 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "C:\AI\kohya_flux\sd-scripts\networks\flux_merge_lora.py", line 765, in <module>
    merge(args)
  File "C:\AI\kohya_flux\sd-scripts\networks\flux_merge_lora.py", line 637, in merge
    flux_state_dict, metadata = merge_lora_models(args.models, args.ratios, merge_dtype, args.concat, args.shuffle)
  File "C:\AI\kohya_flux\sd-scripts\networks\flux_merge_lora.py", line 501, in merge_lora_models
    base_alpha = base_alphas[lora_module_name]
KeyError: 'lora_te1_text_model_encoder_layers_0_mlp_fc1.hada_w1_'
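
Looking at the failing key: it's the full LoHa tensor name with only the last character missing, which suggests (this is my guess from the traceback; I haven't traced the exact code) that merge_lora_models derives module names by searching for the standard ".lora_" suffix. For LoHa keys that search fails, the slice then drops only the final character, and the resulting name was never registered in base_alphas:

```python
# Hypothetical reconstruction of the parsing I suspect is failing; the slice
# logic is an assumption, only the key itself is taken from my LoHa file.
key = "lora_te1_text_model_encoder_layers_0_mlp_fc1.hada_w1_a"

idx = key.rfind(".lora_")  # LoHa tensors are named ".hada_*", so this returns -1
module_name = key[:idx]    # key[:-1] -> drops just the trailing "a"

print(module_name)
# lora_te1_text_model_encoder_layers_0_mlp_fc1.hada_w1_
# ...which is exactly the key in the KeyError above and is absent from base_alphas.
```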

When I tried svd_merge_lora.py, it completed without error, but it produced an essentially empty LoRA of about 1 KB.

(kohya_flux) PS C:\AI\kohya_flux\sd-scripts> python networks\svd_merge_lora.py --save_to D:\ai\models\lora\EnvyFluxAnime02.safetensors --models D:\ai\models\lora\anistyle2-000003.safetensors D:\ai\models\lora\anistyle1-000002.safetensors --ratios 1 1 --new_rank 8
C:\Users\aaa\anaconda3\envs\kohya_flux\lib\site-packages\diffusers\utils\outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
2024-11-02 20:41:02.934685: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-11-02 20:41:03.642024: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
C:\Users\aaa\anaconda3\envs\kohya_flux\lib\site-packages\diffusers\utils\outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
2024-11-02 20:41:06 INFO     new rank: 8, new conv rank: 8                                         svd_merge_lora.py:254
                    INFO     loading: D:\ai\models\lora\anistyle2-000003.safetensors               svd_merge_lora.py:266
                    INFO     merging...                                                            svd_merge_lora.py:282
100%|██████████████████████████████████████████████████████████████████████████████████████| 1640/1640 [00:00<?, ?it/s]
                    INFO     loading: D:\ai\models\lora\anistyle1-000002.safetensors               svd_merge_lora.py:266
                    INFO     merging...                                                            svd_merge_lora.py:282
100%|██████████████████████████████████████████████████████████████████████████████████████| 1640/1640 [00:00<?, ?it/s]
                    INFO     extract new lora...                                                   svd_merge_lora.py:342
0it [00:00, ?it/s]
                    INFO     calculating hashes and creating metadata...                           svd_merge_lora.py:437
                    INFO     saving model to: D:\ai\models\lora\EnvyFluxAnime02.safetensors        svd_merge_lora.py:457
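
For what it's worth, the "0it" extraction pass above is consistent with the files containing only LoHa-style tensors (hada_w1_a/b, hada_w2_a/b, alpha) and no lora_down/lora_up pairs for the SVD step to pick up. A minimal way to check the key names (sketch; the path is just my local file):

```python
# Sketch: list the tensor names actually stored in one of the trained LoHa files.
from safetensors import safe_open

with safe_open(r"D:\ai\models\lora\anistyle1-000002.safetensors", framework="pt") as f:
    keys = list(f.keys())

print(len(keys))
print([k for k in keys if ".hada_" in k][:4])   # LoHa factors: hada_w1_a/b, hada_w2_a/b
print([k for k in keys if ".lora_down" in k])   # expected to be empty for a LoHa file
```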
kohya-ss commented 2 weeks ago

Unfortunately, the merging scripts in this repo don't support network modules from LyCORIS. The LyCORIS repo may have its own scripts for this.
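
For reference, a (simple, non-Tucker) LoHa module stores two low-rank factor pairs and reconstructs its weight delta as a Hadamard product, so it has no lora_up/lora_down tensors for these merge scripts to work with. A rough sketch of what a LyCORIS-aware merge has to do per module (illustrative only; the helper name is made up and this is not a drop-in merge script):

```python
# Sketch (not from sd-scripts or LyCORIS): rebuild full-rank deltas from a simple LoHa file.
import torch
from safetensors.torch import load_file


def loha_full_rank_deltas(path, ratio=1.0):
    sd = load_file(path)
    deltas = {}
    for key in sd:
        if not key.endswith(".hada_w1_a"):
            continue
        module = key[: -len(".hada_w1_a")]
        w1a = sd[module + ".hada_w1_a"].float()
        w1b = sd[module + ".hada_w1_b"].float()
        w2a = sd[module + ".hada_w2_a"].float()
        w2b = sd[module + ".hada_w2_b"].float()
        rank = w1b.shape[0]
        alpha = float(sd.get(module + ".alpha", torch.tensor(float(rank))))
        # LoHa: delta_W = (w1_a @ w1_b) * (w2_a @ w2_b) * (alpha / rank)
        deltas[module] = (w1a @ w1b) * (w2a @ w2b) * (alpha / rank) * ratio
    return deltas
```

Summing these per-module deltas across files would be the full-rank equivalent of a merge, but mapping the module names back onto the Flux checkpoint (or re-decomposing the result with SVD) is exactly what a proper merge script has to handle, so the LyCORIS tooling is the right place to look.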