kohya-ss / sd-scripts

Apache License 2.0

ValueError: Cannot load xxxx from flux/vae because the following keys are missing #1701

Open JingZz7 opened 1 month ago

JingZz7 commented 1 month ago

This is my error report:

Loading pipeline components...:   0%|          | 0/3 [00:00<?, ?it/s]
accelerator device: cuda:6
The config attributes {'latents_mean': None, 'latents_std': None, 'mid_block_add_attention': True, 'shift_factor': 0.1159, 'use_post_quant_conv': False, 'use_quant_conv': False} were passed to AutoencoderKL, but are not expected and will be ignored. Please verify your config.json configuration file.
Loading pipeline components...:   0%|          | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ma-user/sd-scripts/sd-scripts/train_network.py", line 1117, in <module>
    trainer.train(args)
  File "/home/ma-user/sd-scripts/sd-scripts/train_network.py", line 236, in train
    model_version, text_encoder, vae, unet = self.load_target_model(args, weight_dtype, accelerator)
  File "/home/ma-user/sd-scripts/sd-scripts/train_network.py", line 103, in load_target_model
    text_encoder, vae, unet, _ = train_util.load_target_model(args, weight_dtype, accelerator)
  File "/home/ma-user/sd-scripts/sd-scripts/library/train_util.py", line 4387, in load_target_model
    text_encoder, vae, unet, load_stable_diffusion_format = _load_target_model(
  File "/home/ma-user/sd-scripts/sd-scripts/library/train_util.py", line 4349, in _load_target_model
accelerator device: cuda:2
    pipe = StableDiffusionPipeline.from_pretrained(name_or_path, tokenizer=None, safety_checker=None)
  File "/home/ma-user/anaconda3/envs/PyTorch-2.2.0/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-2.2.0/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 1271, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/home/ma-user/anaconda3/envs/PyTorch-2.2.0/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 525, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-2.2.0/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-2.2.0/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 805, in from_pretrained
    raise ValueError(
ValueError: Cannot load <class 'diffusers.models.autoencoders.autoencoder_kl.AutoencoderKL'> from /home_host/zj/flux/ai-modelscope/flux___1-dev/vae because the following keys are missing:
 quant_conv.bias, post_quant_conv.weight, quant_conv.weight, post_quant_conv.bias.
 Please make sure to pass `low_cpu_mem_usage=False` and `device_map=None` if you want to randomly initialize those weights or else make sure your checkpoint file is correct.
[ERROR] 2024-10-16-10:14:27 (PID:3362, Device:0, RankID:0) ERR99999 UNKNOWN application exception

This is the model I used for training: https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main

This is the command I ran:

accelerate launch --num_cpu_threads_per_process 1 train_network.py \
    --pretrained_model_name_or_path="/home_host/zj/flux/ai-modelscope/flux___1-dev" \
    --dataset_config="/home/ma-user/sd-scripts/sd-scripts/train_first.toml" \
    --output_dir="/home/ma-user/sd-scripts/sd-scripts/output" \
    --output_name="output" \
    --save_model_as=safetensors \
    --prior_loss_weight=1.0 \
    --max_train_steps=3000 \
    --learning_rate=1e-4 \
    --optimizer_type="AdamW" \
    --xformers \
    --mixed_precision="fp16" \
    --cache_latents \
    --save_every_n_epochs=3000 \
    --network_module=networks.lora

How can I solve this problem? Thank you in advance for your answers.

JingZz7 commented 1 month ago

The error message above is hard to read as pasted, so here it is copied again:

The config attributes {'latents_mean': None, 'latents_std': None, 'mid_block_add_attention': True, 'shift_factor': 0.1159, 'use_post_quant_conv': False, 'use_quant_conv': False} were passed to AutoencoderKL, but are not expected and will be ignored. Please verify your config.json configuration file.
Loading pipeline components...:   0%|                                                                                                                                                            | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ma-user/sd-scripts/sd-scripts/train_network.py", line 1117, in <module>
    trainer.train(args)
  File "/home/ma-user/sd-scripts/sd-scripts/train_network.py", line 236, in train
    model_version, text_encoder, vae, unet = self.load_target_model(args, weight_dtype, accelerator)
  File "/home/ma-user/sd-scripts/sd-scripts/train_network.py", line 103, in load_target_model
    text_encoder, vae, unet, _ = train_util.load_target_model(args, weight_dtype, accelerator)
  File "/home/ma-user/sd-scripts/sd-scripts/library/train_util.py", line 4387, in load_target_model
    text_encoder, vae, unet, load_stable_diffusion_format = _load_target_model(
  File "/home/ma-user/sd-scripts/sd-scripts/library/train_util.py", line 4349, in _load_target_model
accelerator device: cuda:2
    pipe = StableDiffusionPipeline.from_pretrained(name_or_path, tokenizer=None, safety_checker=None)
  File "/home/ma-user/anaconda3/envs/PyTorch-2.2.0/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-2.2.0/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 1271, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/home/ma-user/anaconda3/envs/PyTorch-2.2.0/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 525, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-2.2.0/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-2.2.0/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 805, in from_pretrained
    raise ValueError(
ValueError: Cannot load <class 'diffusers.models.autoencoders.autoencoder_kl.AutoencoderKL'> from /home_host/zj/flux/ai-modelscope/flux___1-dev/vae because the following keys are missing:
 quant_conv.bias, post_quant_conv.weight, quant_conv.weight, post_quant_conv.bias.
 Please make sure to pass `low_cpu_mem_usage=False` and `device_map=None` if you want to randomly initialize those weights or else make sure your checkpoint file is correct.
[ERROR] 2024-10-16-10:14:27 (PID:3362, Device:0, RankID:0) ERR99999 UNKNOWN application exception

kohya-ss commented 1 month ago

Please use flux_train_network.py instead of train_network.py for FLUX.1 LoRA training.
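
For context, one plausible reading of the log is that train_network.py loads the folder through StableDiffusionPipeline, and the installed AutoencoderKL ignores the `use_quant_conv` and `use_post_quant_conv` flags in the FLUX.1 VAE config. It then still expects the weights for those layers, which the checkpoint does not contain. The sketch below only illustrates that reasoning in plain Python. `FLAG_TO_KEYS` and `keys_expected_but_absent` are hypothetical names made up for this example and are not part of diffusers or sd-scripts.

```python
# Config attributes copied from the warning in the log above.
FLUX_VAE_CONFIG = {
    "latents_mean": None,
    "latents_std": None,
    "mid_block_add_attention": True,
    "shift_factor": 0.1159,
    "use_post_quant_conv": False,
    "use_quant_conv": False,
}

# Assumption for illustration: each flag switches one layer (two weights) on or off.
FLAG_TO_KEYS = {
    "use_quant_conv": ["quant_conv.weight", "quant_conv.bias"],
    "use_post_quant_conv": ["post_quant_conv.weight", "post_quant_conv.bias"],
}


def keys_expected_but_absent(config, ignored_attrs):
    """List weights a loader would still expect if it ignores a flag that disables them."""
    missing = []
    for flag, keys in FLAG_TO_KEYS.items():
        layer_disabled = config.get(flag) is False
        if layer_disabled and flag in ignored_attrs:
            # The checkpoint was saved without this layer, but the loader builds it anyway.
            missing.extend(keys)
    return sorted(missing)


if __name__ == "__main__":
    # The warning says every attribute listed there was ignored.
    ignored = set(FLUX_VAE_CONFIG)
    print(keys_expected_but_absent(FLUX_VAE_CONFIG, ignored))
```

Under these assumptions the function returns the same four keys that the ValueError lists, which is consistent with the advice to use the FLUX-specific script rather than to edit the checkpoint.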

qiqiApink commented 1 month ago

Please use flux_train_network.py instead of train_network.py for FLUX.1 LoRA training.

Where is flux_train_network.py? I can't find it!

JingZz7 commented 1 month ago

Please use flux_train_network.py instead of train_network.py for FLUX.1 LoRA training.

Where is flux_train_network.py? I can't find it!

You need to switch to the sd3 branch.
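
A quick way to confirm that a local checkout is on a branch that ships the script is to look for the file before launching. The sketch below is a minimal check, assuming the script sits at the top level of the sd-scripts checkout (as train_network.py does in the traceback) and that the branch is named sd3 as stated in this thread. `find_flux_script` and `checkout_hint` are hypothetical helper names for this example.

```python
from pathlib import Path

FLUX_SCRIPT = "flux_train_network.py"


def find_flux_script(repo_dir):
    """Return the path to flux_train_network.py inside repo_dir, or None if absent."""
    candidate = Path(repo_dir) / FLUX_SCRIPT
    return candidate if candidate.is_file() else None


def checkout_hint(repo_dir):
    """Return a short message describing the next step for this checkout."""
    script = find_flux_script(repo_dir)
    if script is None:
        # Branch name taken from the thread; it may differ in later versions of the repo.
        return f"{FLUX_SCRIPT} not found in {repo_dir}: run `git checkout sd3` there first"
    return f"found {script}: launch this script instead of train_network.py"


if __name__ == "__main__":
    # "." is a placeholder; point this at the real sd-scripts checkout.
    print(checkout_hint("."))
```

If the file is missing, running `git checkout sd3` inside the repository (after a `git fetch` if the branch is not yet available locally) and re-running the check should report the script as found.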