Linaqruf / kohya-trainer

Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning
Apache License 2.0

Possible model load error. (KeyError: 'time_embed.0.weight') #239

Open Hashizh opened 1 year ago

Hashizh commented 1 year ago

I tried checking whether this was similar to the pipe issue, but no matter which model I switch to, I get the same error. Even models that once worked no longer do.

╭───────────────────── Traceback (most recent call last) ──────────────────────╮
  /content/kohya-trainer/train_network.py:752 in <module>

    749 │   args = parser.parse_args()
    750 │   args = train_util.read_config_from_file(args, parser)
    751 │
  ❱ 752 │   train(args)
    753

  /content/kohya-trainer/train_network.py:152 in train

    149 │   │   if pi == accelerator.state.local_process_index:
    150 │   │   │   print(f"loading model for process {accelerator.state.local
    151 │   │   │
  ❱ 152 │   │   │   text_encoder, vae, unet, _ = train_util.load_target_model(
    153 │   │   │   │   args, weight_dtype, accelerator.device if args.lowram
    154 │   │   │   )
    155

  /content/kohya-trainer/library/train_util.py:2739 in load_target_model

    2736 │   load_stable_diffusion_format = os.path.isfile(name_or_path) # de
    2737 │   if load_stable_diffusion_format:
    2738 │   │   print("load StableDiffusion checkpoint")
  ❱ 2739 │   │   text_encoder, vae, unet = model_util.load_models_from_stable_
    2740 │   else:
    2741 │   │   # Diffusers model is loaded to CPU
    2742 │   │   print("load Diffusers pretrained models")

  /content/kohya-trainer/library/model_util.py:854 in load_models_from_stable_diffusion_checkpoint

    851 │
    852 │   # Convert the UNet2DConditionModel model.
    853 │   unet_config = create_unet_diffusers_config(v2)
  ❱ 854 │   converted_unet_checkpoint = convert_ldm_unet_checkpoint(v2, state
    855 │
    856 │   unet = UNet2DConditionModel(**unet_config).to(device)
    857 │   info = unet.load_state_dict(converted_unet_checkpoint)

  /content/kohya-trainer/library/model_util.py:234 in convert_ldm_unet_checkpoint

    231 │
    232 │   new_checkpoint = {}
    233 │
  ❱ 234 │   new_checkpoint["time_embedding.linear_1.weight"] = unet_state_dic
    235 │   new_checkpoint["time_embedding.linear_1.bias"] = unet_state_dict[
    236 │   new_checkpoint["time_embedding.linear_2.weight"] = unet_state_dic
    237 │   new_checkpoint["time_embedding.linear_2.bias"] = unet_state_dict[
╰──────────────────────────────────────────────────────────────────────────────╯
KeyError: 'time_embed.0.weight'

╭───────────────────── Traceback (most recent call last) ──────────────────────╮
  /usr/local/bin/accelerate:8 in <module>

    5 from accelerate.commands.accelerate_cli import main
    6 if __name__ == '__main__':
    7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
  ❱ 8 │   sys.exit(main())
    9

  /usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py:45 in main

    42 │   │   exit(1)
    43 │
    44 │   # Run
  ❱ 45 │   args.func(args)
    46
    47
    48 if __name__ == "__main__":

  /usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py:1104 in launch_command

    1101 │   elif defaults is not None and defaults.compute_environment == Com
    1102 │   │   sagemaker_launcher(defaults, args)
    1103 │   else:
  ❱ 1104 │   │   simple_launcher(args)
    1105
    1106
    1107 def main():

  /usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py:567 in simple_launcher

    564 │   process = subprocess.Popen(cmd, env=current_env)
    565 │   process.wait()
    566 │   if process.returncode != 0:
  ❱ 567 │   │   raise subprocess.CalledProcessError(returncode=process.return
    568
    569
    570 def multi_gpu_launcher(args):
╰──────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['/usr/bin/python3', 'train_network.py', '--sample_prompts=/content/LoRA/config/sample_prompt.txt', '--dataset_config=/content/LoRA/config/dataset_config.toml', '--config_file=/content/LoRA/config/config_file.toml']' returned non-zero exit status 1.
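
The KeyError is raised at `model_util.py:234`, where `convert_ldm_unet_checkpoint` reads `time_embed.0.weight` from the UNet state dict, so that key is missing from the loaded weights. One possible cause (an assumption, not confirmed in this thread) is that the file at the configured model path does not hold a Stable Diffusion v1/v2 UNet, for example an incomplete download or a different kind of model file. A minimal sketch for checking a state dict's keys before training follows. The helper names are hypothetical and not part of kohya-trainer, and both the `model.diffusion_model.` prefix and the `time_embed.2.*` keys are assumptions about the checkpoint layout that the traceback does not show.

```python
# Hypothetical diagnostic helpers; not part of kohya-trainer.
# Assumption: UNet weights in an SD v1/v2 checkpoint are stored under this prefix.
UNET_PREFIX = "model.diffusion_model."

# 'time_embed.0.weight' is the key from the traceback; the other three are
# assumed to be the inputs for the remaining assignments on lines 235-237.
REQUIRED_KEYS = (
    "time_embed.0.weight",
    "time_embed.0.bias",
    "time_embed.2.weight",
    "time_embed.2.bias",
)


def extract_unet_keys(state_dict, prefix=UNET_PREFIX):
    """Return the key names under `prefix`, with the prefix stripped."""
    return {key[len(prefix):] for key in state_dict if key.startswith(prefix)}


def missing_time_embed_keys(state_dict, prefix=UNET_PREFIX):
    """List the required time-embedding keys that the state dict lacks."""
    unet_keys = extract_unet_keys(state_dict, prefix)
    return [key for key in REQUIRED_KEYS if key not in unet_keys]


if __name__ == "__main__":
    # Stand-in dictionaries; with a real file, pass the loaded state dict instead.
    good = {UNET_PREFIX + key: None for key in REQUIRED_KEYS}
    other = {"some.unrelated.weight": None}
    print(missing_time_embed_keys(good))
    print(missing_time_embed_keys(other))
```

If the returned list is non-empty for the model file in use, the checkpoint itself is worth re-downloading or replacing before looking further at the trainer.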