bmaltais / kohya_ss

Apache License 2.0

I get an error when I train LoRA #192

Closed Crimsonfart closed 1 year ago

Crimsonfart commented 1 year ago

Can someone help me? I get the following error when I train LoRA. In a Discord channel I saw that others got exactly the same error.

Load CSS... Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch(). Captioning files in C:/Users/...../Documents/LORA Training Data/Test/image/100_test... .\venv\Scripts\python.exe "finetune/make_captions.py" --batch_size="1" --num_beams="1" --top_p="0.9" --max_length="75" --min_length="5" --beam_search --caption_extension=".txt" "C:/Users/...../Documents/LORA Training Data/Test/image/100_test" --caption_weights="https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_large_caption.pth" Current Working Directory is: C:\Users.....\Documents\kohya\kohya_ss load images from C:\Users.....\Documents\LORA Training Data\Test\image\100_test found 11 images. loading BLIP caption: https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_large_caption.pth Downloading (…)solve/main/vocab.txt: 100%|███████████████████████████████████████████| 232k/232k [00:00<00:00, 408kB/s] Downloading (…)okenizer_config.json: 100%|██████████████████████████████████████████| 28.0/28.0 [00:00<00:00, 28.5kB/s] Downloading (…)lve/main/config.json: 100%|█████████████████████████████████████████████| 570/570 [00:00<00:00, 286kB/s] 100%|█████████████████████████████████████████████████████████████████████████████| 1.66G/1.66G [02:23<00:00, 12.5MB/s] load checkpoint from https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_large_caption.pth BLIP loaded 100%|██████████████████████████████████████████████████████████████████████████████████| 11/11 [00:06<00:00, 1.74it/s] done! ...captioning done Folder 100_test: 1100 steps max_train_steps = 550 stop_text_encoder_training = 0 lr_warmup_steps = 55 accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket --pretrained_model_name_or_path="C:/Apps/AI/stable-diffusion-webui/models/Stable-diffusion/realisticVisionV13_v13VAEIncluded.safetensors" --train_data_dir="C:/Users/basil/Documents/LORA Training Data/Test/image" --resolution=768,768 --output_dir="C:/Users/basil/Documents/LORA Training Data/Test/model" --logging_dir="C:/Users/basil/Documents/LORA Training Data/Test/log" --network_alpha="1" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=8 --output_name="last" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="55" --train_batch_size="2" --max_train_steps="550" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --seed="1234" --cache_latents --bucket_reso_steps=64 --xformers --use_8bit_adam --bucket_no_upscale prepare tokenizer Use DreamBooth method. prepare train images. found directory 100_test contains 11 image files 1100 train images with repeating. loading image sizes. 100%|█████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 923.71it/s] make buckets min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む) bucket 0: resolution (768, 768), count: 1100 mean ar error (without repeats): 0.0 prepare accelerator Using accelerator 0.15.0 or above. 
load StableDiffusion checkpoint loading u-net: loading vae: Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: ['vision_model.encoder.layers.20.self_attn.v_proj.bias', 'vision_model.encoder.layers.9.mlp.fc2.weight', 'vision_model.encoder.layers.23.mlp.fc1.bias', 'vision_model.encoder.layers.10.self_attn.out_proj.weight', 'vision_model.encoder.layers.12.self_attn.v_proj.weight', 'vision_model.encoder.layers.17.mlp.fc2.weight', 'vision_model.encoder.layers.0.layer_norm1.weight', 'vision_model.encoder.layers.2.self_attn.out_proj.weight', 'vision_model.encoder.layers.16.self_attn.out_proj.weight', 'vision_model.encoder.layers.7.layer_norm2.weight', 'vision_model.encoder.layers.6.self_attn.out_proj.weight', 'vision_model.encoder.layers.5.layer_norm1.bias', 'vision_model.encoder.layers.11.mlp.fc1.bias', 'vision_model.encoder.layers.5.self_attn.q_proj.bias', 'vision_model.encoder.layers.11.layer_norm2.bias', 'vision_model.encoder.layers.4.self_attn.v_proj.bias', 'vision_model.encoder.layers.16.mlp.fc2.bias', 'vision_model.encoder.layers.16.self_attn.out_proj.bias', 'vision_model.encoder.layers.3.self_attn.k_proj.weight', 'vision_model.encoder.layers.9.self_attn.q_proj.bias', 'vision_model.encoder.layers.21.layer_norm1.weight', 'vision_model.encoder.layers.22.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.mlp.fc2.bias', 'vision_model.encoder.layers.7.self_attn.v_proj.weight', 'vision_model.encoder.layers.15.self_attn.k_proj.weight', 'vision_model.encoder.layers.13.mlp.fc1.bias', 'vision_model.encoder.layers.2.mlp.fc2.bias', 'vision_model.encoder.layers.12.mlp.fc2.weight', 'vision_model.encoder.layers.13.self_attn.out_proj.bias', 'vision_model.encoder.layers.12.self_attn.out_proj.bias', 'vision_model.encoder.layers.7.mlp.fc1.bias', 'vision_model.encoder.layers.8.self_attn.out_proj.bias', 'vision_model.encoder.layers.15.self_attn.k_proj.bias', 'vision_model.encoder.layers.13.self_attn.k_proj.bias', 'vision_model.encoder.layers.19.self_attn.out_proj.bias', 'vision_model.encoder.layers.21.mlp.fc1.weight', 'vision_model.encoder.layers.5.self_attn.v_proj.weight', 'vision_model.encoder.layers.8.mlp.fc2.weight', 'vision_model.encoder.layers.12.self_attn.v_proj.bias', 'vision_model.encoder.layers.20.self_attn.q_proj.weight', 'vision_model.encoder.layers.3.layer_norm2.bias', 'vision_model.encoder.layers.19.mlp.fc2.bias', 'vision_model.encoder.layers.7.self_attn.v_proj.bias', 'vision_model.encoder.layers.14.self_attn.k_proj.bias', 'vision_model.encoder.layers.20.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.self_attn.k_proj.bias', 'vision_model.encoder.layers.10.self_attn.k_proj.weight', 'vision_model.encoder.layers.3.self_attn.v_proj.weight', 'vision_model.encoder.layers.4.self_attn.q_proj.weight', 'vision_model.encoder.layers.12.layer_norm1.weight', 'vision_model.encoder.layers.1.mlp.fc1.bias', 'vision_model.encoder.layers.23.layer_norm2.weight', 'vision_model.encoder.layers.18.layer_norm2.bias', 'vision_model.encoder.layers.16.layer_norm2.weight', 'vision_model.encoder.layers.12.self_attn.q_proj.bias', 'vision_model.encoder.layers.17.self_attn.out_proj.weight', 'visual_projection.weight', 'vision_model.encoder.layers.8.mlp.fc2.bias', 'vision_model.encoder.layers.4.layer_norm1.bias', 'vision_model.encoder.layers.6.self_attn.q_proj.weight', 'vision_model.encoder.layers.22.self_attn.out_proj.weight', 'vision_model.encoder.layers.19.mlp.fc2.weight', 'vision_model.encoder.layers.23.self_attn.q_proj.bias', 
'vision_model.encoder.layers.16.mlp.fc2.weight', 'vision_model.encoder.layers.15.self_attn.v_proj.weight', 'vision_model.encoder.layers.8.self_attn.q_proj.bias', 'vision_model.encoder.layers.23.self_attn.out_proj.bias', 'vision_model.encoder.layers.3.self_attn.out_proj.weight', 'vision_model.encoder.layers.15.self_attn.v_proj.bias', 'vision_model.encoder.layers.10.self_attn.q_proj.bias', 'vision_model.encoder.layers.17.mlp.fc1.bias', 'vision_model.encoder.layers.9.self_attn.v_proj.bias', 'vision_model.encoder.layers.19.self_attn.v_proj.weight', 'vision_model.encoder.layers.2.mlp.fc1.bias', 'vision_model.encoder.layers.19.mlp.fc1.weight', 'vision_model.encoder.layers.2.self_attn.k_proj.bias', 'vision_model.encoder.layers.19.layer_norm2.bias', 'vision_model.encoder.layers.7.layer_norm1.weight', 'vision_model.encoder.layers.12.layer_norm2.bias', 'vision_model.encoder.layers.9.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.layer_norm2.bias', 'vision_model.encoder.layers.2.layer_norm1.bias', 'vision_model.encoder.layers.5.mlp.fc1.weight', 'vision_model.encoder.layers.16.self_attn.v_proj.bias', 'vision_model.encoder.layers.3.mlp.fc1.weight', 'vision_model.encoder.layers.17.layer_norm1.weight', 'vision_model.encoder.layers.12.mlp.fc1.bias', 'vision_model.encoder.layers.10.mlp.fc2.weight', 'vision_model.encoder.layers.12.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.self_attn.k_proj.bias', 'vision_model.encoder.layers.20.self_attn.v_proj.weight', 'vision_model.encoder.layers.21.self_attn.v_proj.weight', 'vision_model.encoder.layers.14.self_attn.q_proj.weight', 'vision_model.encoder.layers.12.mlp.fc1.weight', 'vision_model.encoder.layers.7.mlp.fc2.weight', 'vision_model.encoder.layers.13.mlp.fc2.bias', 'vision_model.encoder.layers.5.mlp.fc2.weight', 'vision_model.encoder.layers.18.self_attn.v_proj.weight', 'vision_model.encoder.layers.13.self_attn.v_proj.weight', 'vision_model.encoder.layers.20.layer_norm2.weight', 'vision_model.encoder.layers.1.mlp.fc2.weight', 'vision_model.encoder.layers.10.mlp.fc1.weight', 'vision_model.encoder.layers.3.self_attn.out_proj.bias', 'vision_model.encoder.layers.8.self_attn.q_proj.weight', 'vision_model.encoder.layers.4.self_attn.v_proj.weight', 'vision_model.encoder.layers.5.mlp.fc2.bias', 'vision_model.encoder.layers.0.mlp.fc1.weight', 'vision_model.encoder.layers.1.layer_norm2.bias', 'vision_model.encoder.layers.13.self_attn.v_proj.bias', 'vision_model.encoder.layers.21.mlp.fc2.bias', 'vision_model.encoder.layers.4.self_attn.k_proj.bias', 'vision_model.encoder.layers.23.self_attn.q_proj.weight', 'vision_model.encoder.layers.13.self_attn.out_proj.weight', 'vision_model.encoder.layers.14.mlp.fc1.weight', 'vision_model.encoder.layers.7.self_attn.q_proj.weight', 'vision_model.encoder.layers.15.layer_norm1.weight', 'vision_model.encoder.layers.22.self_attn.k_proj.bias', 'vision_model.encoder.layers.22.self_attn.q_proj.bias', 'vision_model.encoder.layers.17.layer_norm1.bias', 'vision_model.encoder.layers.13.mlp.fc2.weight', 'vision_model.encoder.layers.4.mlp.fc2.weight', 'vision_model.encoder.layers.5.self_attn.k_proj.bias', 'vision_model.encoder.layers.10.self_attn.out_proj.bias', 'vision_model.encoder.layers.11.self_attn.v_proj.bias', 'vision_model.encoder.layers.11.self_attn.k_proj.bias', 'vision_model.encoder.layers.16.layer_norm1.weight', 'vision_model.encoder.layers.21.mlp.fc1.bias', 'vision_model.encoder.layers.15.mlp.fc2.weight', 'vision_model.encoder.layers.18.layer_norm2.weight', 'vision_model.encoder.layers.18.self_attn.v_proj.bias', 
'vision_model.encoder.layers.18.mlp.fc1.bias', 'vision_model.encoder.layers.9.self_attn.k_proj.bias', 'vision_model.encoder.layers.8.self_attn.v_proj.bias', 'vision_model.encoder.layers.6.self_attn.v_proj.weight', 'vision_model.encoder.layers.4.mlp.fc1.bias', 'vision_model.encoder.layers.14.self_attn.v_proj.weight', 'vision_model.encoder.layers.4.mlp.fc2.bias', 'vision_model.encoder.layers.18.self_attn.k_proj.weight', 'vision_model.encoder.layers.1.layer_norm1.weight', 'vision_model.encoder.layers.21.mlp.fc2.weight', 'vision_model.encoder.layers.20.mlp.fc2.bias', 'vision_model.encoder.layers.12.mlp.fc2.bias', 'vision_model.encoder.layers.21.self_attn.out_proj.weight', 'vision_model.encoder.layers.0.self_attn.out_proj.weight', 'vision_model.encoder.layers.13.layer_norm2.bias', 'vision_model.encoder.layers.18.mlp.fc2.bias', 'vision_model.encoder.layers.0.mlp.fc1.bias', 'vision_model.encoder.layers.15.self_attn.q_proj.weight', 'vision_model.encoder.layers.18.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.self_attn.k_proj.bias', 'vision_model.encoder.layers.23.mlp.fc1.weight', 'vision_model.encoder.layers.0.self_attn.k_proj.bias', 'vision_model.encoder.layers.11.layer_norm2.weight', 'vision_model.encoder.layers.1.self_attn.q_proj.weight', 'vision_model.embeddings.patch_embedding.weight', 'vision_model.encoder.layers.8.layer_norm1.bias', 'vision_model.encoder.layers.11.layer_norm1.weight', 'vision_model.encoder.layers.11.mlp.fc1.weight', 'vision_model.encoder.layers.6.layer_norm1.bias', 'vision_model.encoder.layers.19.layer_norm2.weight', 'vision_model.encoder.layers.2.self_attn.k_proj.weight', 'vision_model.encoder.layers.14.self_attn.out_proj.bias', 'vision_model.encoder.layers.16.mlp.fc1.weight', 'vision_model.encoder.layers.5.layer_norm2.bias', 'vision_model.encoder.layers.23.self_attn.out_proj.weight', 'vision_model.encoder.layers.1.mlp.fc1.weight', 'vision_model.encoder.layers.19.self_attn.q_proj.bias', 'vision_model.encoder.layers.0.self_attn.v_proj.bias', 'vision_model.encoder.layers.15.mlp.fc2.bias', 'vision_model.encoder.layers.18.mlp.fc2.weight', 'vision_model.encoder.layers.10.layer_norm2.weight', 'vision_model.pre_layrnorm.weight', 'vision_model.encoder.layers.6.layer_norm1.weight', 'vision_model.encoder.layers.1.self_attn.k_proj.weight', 'vision_model.encoder.layers.18.self_attn.q_proj.weight', 'vision_model.encoder.layers.10.mlp.fc1.bias', 'vision_model.encoder.layers.15.mlp.fc1.weight', 'vision_model.encoder.layers.23.self_attn.k_proj.bias', 'vision_model.encoder.layers.15.self_attn.out_proj.bias', 'vision_model.encoder.layers.17.layer_norm2.bias', 'vision_model.encoder.layers.5.self_attn.out_proj.bias', 'vision_model.encoder.layers.20.mlp.fc1.bias', 'vision_model.encoder.layers.11.self_attn.out_proj.bias', 'vision_model.pre_layrnorm.bias', 'vision_model.encoder.layers.19.self_attn.out_proj.weight', 'vision_model.encoder.layers.17.self_attn.out_proj.bias', 'vision_model.encoder.layers.5.self_attn.out_proj.weight', 'vision_model.encoder.layers.5.self_attn.k_proj.weight', 'vision_model.encoder.layers.22.layer_norm1.bias', 'vision_model.encoder.layers.8.mlp.fc1.bias', 'vision_model.encoder.layers.0.self_attn.q_proj.bias', 'vision_model.encoder.layers.12.self_attn.k_proj.weight', 'vision_model.encoder.layers.10.self_attn.q_proj.weight', 'vision_model.post_layernorm.bias', 'vision_model.encoder.layers.14.layer_norm1.bias', 'vision_model.encoder.layers.3.self_attn.q_proj.weight', 
'vision_model.encoder.layers.9.mlp.fc2.bias', 'vision_model.encoder.layers.16.self_attn.v_proj.weight', 'vision_model.encoder.layers.0.self_attn.v_proj.weight', 'vision_model.encoder.layers.11.self_attn.out_proj.weight', 'vision_model.encoder.layers.3.layer_norm2.weight', 'vision_model.encoder.layers.17.mlp.fc2.bias', 'vision_model.encoder.layers.2.mlp.fc2.weight', 'vision_model.encoder.layers.11.self_attn.k_proj.weight', 'vision_model.encoder.layers.3.self_attn.v_proj.bias', 'vision_model.encoder.layers.0.mlp.fc2.weight', 'vision_model.encoder.layers.13.mlp.fc1.weight', 'vision_model.encoder.layers.21.layer_norm2.bias', 'vision_model.encoder.layers.0.self_attn.k_proj.weight', 'vision_model.encoder.layers.8.self_attn.k_proj.weight', 'vision_model.encoder.layers.13.self_attn.k_proj.weight', 'vision_model.encoder.layers.20.self_attn.k_proj.weight', 'vision_model.encoder.layers.11.self_attn.v_proj.weight', 'vision_model.encoder.layers.12.layer_norm1.bias', 'vision_model.encoder.layers.9.layer_norm2.bias', 'vision_model.encoder.layers.7.layer_norm1.bias', 'vision_model.encoder.layers.20.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.layer_norm1.weight', 'vision_model.encoder.layers.9.layer_norm1.bias', 'vision_model.encoder.layers.1.self_attn.q_proj.bias', 'vision_model.encoder.layers.2.layer_norm2.bias', 'vision_model.encoder.layers.22.self_attn.k_proj.weight', 'vision_model.encoder.layers.4.mlp.fc1.weight', 'vision_model.post_layernorm.weight', 'vision_model.encoder.layers.9.mlp.fc1.weight', 'vision_model.encoder.layers.17.self_attn.k_proj.weight', 'vision_model.encoder.layers.21.self_attn.q_proj.weight', 'vision_model.encoder.layers.1.self_attn.v_proj.weight', 'logit_scale', 'vision_model.encoder.layers.9.self_attn.k_proj.weight', 'vision_model.encoder.layers.18.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.self_attn.v_proj.weight', 'vision_model.encoder.layers.23.self_attn.v_proj.bias', 'vision_model.encoder.layers.2.self_attn.q_proj.weight', 'vision_model.encoder.layers.2.self_attn.v_proj.bias', 'vision_model.encoder.layers.11.mlp.fc2.bias', 'vision_model.encoder.layers.9.self_attn.q_proj.weight', 'vision_model.encoder.layers.16.self_attn.q_proj.bias', 'vision_model.encoder.layers.22.layer_norm2.weight', 'vision_model.encoder.layers.6.mlp.fc1.weight', 'vision_model.encoder.layers.6.self_attn.k_proj.weight', 'vision_model.encoder.layers.13.layer_norm1.bias', 'vision_model.encoder.layers.20.self_attn.out_proj.weight', 'vision_model.encoder.layers.7.mlp.fc2.bias', 'vision_model.encoder.layers.11.self_attn.q_proj.bias', 'vision_model.encoder.layers.8.self_attn.out_proj.weight', 'vision_model.encoder.layers.0.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.layer_norm1.weight', 'vision_model.encoder.layers.7.layer_norm2.bias', 'vision_model.encoder.layers.12.self_attn.k_proj.bias', 'vision_model.encoder.layers.11.layer_norm1.bias', 'vision_model.encoder.layers.19.self_attn.v_proj.bias', 'vision_model.encoder.layers.3.layer_norm1.bias', 'vision_model.encoder.layers.3.layer_norm1.weight', 'vision_model.encoder.layers.5.layer_norm2.weight', 'vision_model.encoder.layers.14.layer_norm2.weight', 'vision_model.encoder.layers.4.self_attn.k_proj.weight', 'vision_model.encoder.layers.9.self_attn.v_proj.weight', 'vision_model.encoder.layers.17.self_attn.v_proj.bias', 'vision_model.encoder.layers.0.mlp.fc2.bias', 'vision_model.encoder.layers.3.self_attn.k_proj.bias', 'vision_model.encoder.layers.17.self_attn.q_proj.weight', 'vision_model.encoder.layers.15.layer_norm1.bias', 
'vision_model.encoder.layers.9.mlp.fc1.bias', 'vision_model.encoder.layers.3.mlp.fc2.bias', 'vision_model.encoder.layers.3.mlp.fc2.weight', 'vision_model.encoder.layers.0.layer_norm1.bias', 'vision_model.encoder.layers.22.layer_norm2.bias', 'vision_model.encoder.layers.4.self_attn.q_proj.bias', 'vision_model.encoder.layers.4.layer_norm2.bias', 'vision_model.encoder.layers.13.self_attn.q_proj.bias', 'vision_model.encoder.layers.23.mlp.fc2.bias', 'vision_model.embeddings.position_ids', 'vision_model.encoder.layers.19.self_attn.q_proj.weight', 'vision_model.encoder.layers.5.self_attn.v_proj.bias', 'vision_model.encoder.layers.15.layer_norm2.bias', 'vision_model.encoder.layers.13.self_attn.q_proj.weight', 'vision_model.encoder.layers.22.mlp.fc1.weight', 'vision_model.encoder.layers.13.layer_norm2.weight', 'vision_model.encoder.layers.2.mlp.fc1.weight', 'vision_model.encoder.layers.15.self_attn.q_proj.bias', 'vision_model.encoder.layers.5.mlp.fc1.bias', 'vision_model.encoder.layers.13.layer_norm1.weight', 'vision_model.encoder.layers.14.self_attn.q_proj.bias', 'vision_model.encoder.layers.16.self_attn.k_proj.weight', 'vision_model.encoder.layers.7.self_attn.k_proj.bias', 'vision_model.encoder.layers.14.mlp.fc1.bias', 'vision_model.encoder.layers.17.self_attn.v_proj.weight', 'vision_model.encoder.layers.2.self_attn.v_proj.weight', 'vision_model.encoder.layers.21.layer_norm2.weight', 'vision_model.encoder.layers.7.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.self_attn.v_proj.bias', 'vision_model.encoder.layers.6.self_attn.v_proj.bias', 'vision_model.encoder.layers.23.layer_norm2.bias', 'vision_model.encoder.layers.22.self_attn.v_proj.weight', 'vision_model.encoder.layers.2.self_attn.out_proj.bias', 'vision_model.embeddings.class_embedding', 'vision_model.embeddings.position_embedding.weight', 'vision_model.encoder.layers.18.self_attn.out_proj.weight', 'vision_model.encoder.layers.14.self_attn.k_proj.weight', 'vision_model.encoder.layers.2.layer_norm1.weight', 'vision_model.encoder.layers.6.mlp.fc2.bias', 'vision_model.encoder.layers.21.layer_norm1.bias', 'vision_model.encoder.layers.1.self_attn.out_proj.weight', 'vision_model.encoder.layers.8.layer_norm2.weight', 'vision_model.encoder.layers.1.self_attn.v_proj.bias', 'vision_model.encoder.layers.18.layer_norm1.weight', 'vision_model.encoder.layers.21.self_attn.out_proj.bias', 'vision_model.encoder.layers.23.layer_norm1.bias', 'vision_model.encoder.layers.11.mlp.fc2.weight', 'vision_model.encoder.layers.12.layer_norm2.weight', 'vision_model.encoder.layers.9.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.layer_norm2.bias', 'vision_model.encoder.layers.2.layer_norm2.weight', 'vision_model.encoder.layers.8.self_attn.v_proj.weight', 'vision_model.encoder.layers.21.self_attn.q_proj.bias', 'vision_model.encoder.layers.15.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.layer_norm1.weight', 'vision_model.encoder.layers.20.mlp.fc2.weight', 'vision_model.encoder.layers.17.mlp.fc1.weight', 'vision_model.encoder.layers.4.self_attn.out_proj.weight', 'vision_model.encoder.layers.22.self_attn.v_proj.bias', 'vision_model.encoder.layers.17.self_attn.k_proj.bias', 'vision_model.encoder.layers.16.layer_norm2.bias', 'vision_model.encoder.layers.12.self_attn.q_proj.weight', 'vision_model.encoder.layers.5.layer_norm1.weight', 'vision_model.encoder.layers.22.self_attn.q_proj.weight', 'vision_model.encoder.layers.7.mlp.fc1.weight', 'vision_model.encoder.layers.19.mlp.fc1.bias', 'vision_model.encoder.layers.22.mlp.fc1.bias', 
'vision_model.encoder.layers.6.layer_norm2.weight', 'vision_model.encoder.layers.14.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.mlp.fc1.weight', 'vision_model.encoder.layers.10.layer_norm2.bias', 'vision_model.encoder.layers.21.self_attn.v_proj.bias', 'vision_model.encoder.layers.19.layer_norm1.weight', 'vision_model.encoder.layers.18.mlp.fc1.weight', 'vision_model.encoder.layers.23.self_attn.k_proj.weight', 'vision_model.encoder.layers.1.mlp.fc2.bias', 'vision_model.encoder.layers.22.mlp.fc2.weight', 'vision_model.encoder.layers.23.mlp.fc2.weight', 'vision_model.encoder.layers.5.self_attn.q_proj.weight', 'vision_model.encoder.layers.16.self_attn.q_proj.weight', 'vision_model.encoder.layers.1.layer_norm2.weight', 'vision_model.encoder.layers.21.self_attn.k_proj.bias', 'vision_model.encoder.layers.10.self_attn.v_proj.bias', 'vision_model.encoder.layers.16.self_attn.k_proj.bias', 'vision_model.encoder.layers.3.mlp.fc1.bias', 'vision_model.encoder.layers.15.layer_norm2.weight', 'vision_model.encoder.layers.17.layer_norm2.weight', 'vision_model.encoder.layers.6.mlp.fc2.weight', 'vision_model.encoder.layers.8.self_attn.k_proj.bias', 'vision_model.encoder.layers.19.self_attn.k_proj.weight', 'vision_model.encoder.layers.16.layer_norm1.bias', 'vision_model.encoder.layers.8.layer_norm1.weight', 'vision_model.encoder.layers.9.layer_norm2.weight', 'vision_model.encoder.layers.22.layer_norm1.weight', 'vision_model.encoder.layers.6.self_attn.out_proj.bias', 'vision_model.encoder.layers.4.layer_norm2.weight', 'vision_model.encoder.layers.3.self_attn.q_proj.bias', 'vision_model.encoder.layers.4.layer_norm1.weight', 'vision_model.encoder.layers.6.self_attn.k_proj.bias', 'vision_model.encoder.layers.8.mlp.fc1.weight', 'vision_model.encoder.layers.7.self_attn.k_proj.weight', 'vision_model.encoder.layers.16.mlp.fc1.bias', 'vision_model.encoder.layers.0.layer_norm2.weight', 'vision_model.encoder.layers.21.self_attn.k_proj.weight', 'vision_model.encoder.layers.4.self_attn.out_proj.bias', 'vision_model.encoder.layers.0.self_attn.q_proj.weight', 'vision_model.encoder.layers.18.layer_norm1.bias', 'vision_model.encoder.layers.18.self_attn.k_proj.bias', 'vision_model.encoder.layers.15.mlp.fc1.bias', 'vision_model.encoder.layers.10.layer_norm1.bias', 'vision_model.encoder.layers.22.mlp.fc2.bias', 'vision_model.encoder.layers.19.layer_norm1.bias', 'vision_model.encoder.layers.8.layer_norm2.bias', 'text_projection.weight', 'vision_model.encoder.layers.2.self_attn.q_proj.bias', 'vision_model.encoder.layers.6.self_attn.q_proj.bias', 'vision_model.encoder.layers.11.self_attn.q_proj.weight', 'vision_model.encoder.layers.9.layer_norm1.weight', 'vision_model.encoder.layers.14.mlp.fc2.weight', 'vision_model.encoder.layers.17.self_attn.q_proj.bias', 'vision_model.encoder.layers.7.self_attn.out_proj.weight', 'vision_model.encoder.layers.10.mlp.fc2.bias', 'vision_model.encoder.layers.23.self_attn.v_proj.weight', 'vision_model.encoder.layers.6.mlp.fc1.bias', 'vision_model.encoder.layers.19.self_attn.k_proj.bias', 'vision_model.encoder.layers.23.layer_norm1.weight', 'vision_model.encoder.layers.20.layer_norm1.bias', 'vision_model.encoder.layers.7.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.layer_norm1.bias', 'vision_model.encoder.layers.6.layer_norm2.bias', 'vision_model.encoder.layers.0.layer_norm2.bias']

===================================BUG REPORT=================================== Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link

CUDA SETUP: Loading binary C:\Users.....\Documents\kohya\kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll... use 8-bit Adam optimizer running training / 学習開始 num train images * repeats / 学習画像の数×繰り返し回数: 1100 num reg images / 正則化画像の数: 0 num batches per epoch / 1epochのバッチ数: 550 num epochs / epoch数: 1 batch size per device / バッチサイズ: 2 total train batch size (with parallel & distributed & accumulation) / 総バッチサイズ(並列学習、勾配合計含む): 2 gradient accumulation steps / 勾配を合計するステップ数 = 1 total optimization steps / 学習ステップ数: 550 Traceback (most recent call last): File "C:\Users.....\Documents\kohya\kohya_ss\train_network.py", line 573, in train(args) File "C:\Users.....\Documents\kohya\kohya_ss\train_network.py", line 356, in train "ss_noise_offset": args.noise_offset, AttributeError: 'Namespace' object has no attribute 'noise_offset' Traceback (most recent call last): File "C:\Users.....\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\basil\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\Users.....\Documents\kohya\kohya_ss\venv\Scripts\accelerate.exe__main__.py", line 7, in File "C:\Users.....\Documents\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "C:\Users.....l\Documents\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "C:\Users\basil\Documents\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\Users\basil\Documents\kohya\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=C:/Apps/AI/stable-diffusion-webui/models/Stable-diffusion/realisticVisionV13_v13VAEIncluded.safetensors', '--train_data_dir=C:/Users/basil/Documents/LORA Training Data/Test/image', '--resolution=768,768', '--output_dir=C:/Users/basil/Documents/LORA Training Data/Test/model', '--logging_dir=C:/Users/basil/Documents/LORA Training Data/Test/log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=55', '--train_batch_size=2', '--max_train_steps=550', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--cache_latents', '--bucket_reso_steps=64', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.
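
For context on the failing line: train_network.py reads args.noise_offset, but the older library/train_util.py bundled here never registers a --noise_offset argument, so argparse never creates that attribute. Below is a minimal, hypothetical sketch of that class of fix, not the upstream code; the argument name comes from the traceback, while the type, default and help text are assumptions (the real fix is simply updating train_util.py, as the replies below describe):

```python
# Hypothetical sketch only: register the missing CLI argument so argparse
# creates the attribute that train_network.py later reads as args.noise_offset.
import argparse


def add_training_arguments(parser: argparse.ArgumentParser) -> None:
    # Assumed type/default/help; the real upstream train_util.py may differ.
    parser.add_argument(
        "--noise_offset",
        type=float,
        default=None,
        help="noise offset applied during training (assumed semantics)",
    )


parser = argparse.ArgumentParser()
add_training_arguments(parser)
args = parser.parse_args([])      # no flags passed
print(args.noise_offset)          # prints None instead of raising AttributeError
```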

Rika-Mipa commented 1 year ago

Yeah, if you updated (git pull) today, then you cannot train models any more. You will receive error logs like this: line 45, in main args.func(args); line 1104, in launch_command; line 567, in simple_launcher

Nyaster commented 1 year ago

Just revert to the previous commit.

martianunlimited commented 1 year ago

Alternatively, just replace library/train_util.py with kohya's new version https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py

Rika-Mipa commented 1 year ago

Alternatively, just replace library/train_util.py with kohya's new version https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py

The software works after replacing train_util.py. Thank you for your help.

Thund3rPat commented 1 year ago

I was able to resolve this issue by merging the dev branch into the master branch. This commit 641a168e55f429c79f9114bcdb123a13bc9b2167 resolved it for me and was probably forgotten.

reduxo1 commented 1 year ago

Alternatively, just replace library/train_util.py with kohya's new version https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py

fixed for me too, ty

MalpoDeMalpis commented 1 year ago

AttributeError: 'Namespace' object has no attribute 'noise_offset'

Same problem here, and replacing train_util.py is not a solution.

Maranpani commented 1 year ago

It didn't fix the problem at all :(. Does someone have another idea, please?

I replaced library/train_util.py with kohya's new version https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py

CUDA SETUP: Loading binary C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll... use 8-bit Adam optimizer running training / 学習開始 num train images * repeats / 学習画像の数×繰り返し回数: 1700 num reg images / 正則化画像の数: 0 num batches per epoch / 1epochのバッチ数: 850 num epochs / epoch数: 1 batch size per device / バッチサイズ: 2 total train batch size (with parallel & distributed & accumulation) / 総バッチサイズ(並列学習、勾配合計含む): 2 gradient accumulation steps / 勾配を合計するステップ数 = 1 total optimization steps / 学習ステップ数: 850 steps: 0%| | 0/850 [00:00<?, ?it/s]epoch 1/1 C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\torch\utils\checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn("None of the inputs have requires_grad=True. Gradients will be None") Error no kernel image is available for execution on the device at line 167 in file D:\ai\tool\bitsandbytes\csrc\ops.cu Traceback (most recent call last): File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\Scripts\accelerate.exe__main__.py", line 7, in File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=//UTILISATEUR-PC/Users/Utilisateur/stable-diffusion-webui/models/Stable-diffusion/realisticVisionV13_v13.safetensors', '--train_data_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\image', '--resolution=512,512', '--output_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\model', '--logging_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=85', '--train_batch_size=2', '--max_train_steps=850', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--cache_latents', '--bucket_reso_steps=64', '--mem_eff_attn', '--gradient_checkpointing', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.
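
A side note on this particular log: the fatal line is not the CUDA SETUP banner but "Error no kernel image is available for execution on the device", which comes from bitsandbytes (the 8-bit Adam optimizer) and usually means the prebuilt libbitsandbytes DLL was not built for this GPU's compute capability. That is consistent with the workaround found later in the thread (disabling 8bit adam). A small, hedged check of what PyTorch reports about the GPU, using only standard torch calls:

```python
# Print the GPU name and compute capability as PyTorch sees them; a prebuilt
# bitsandbytes DLL that lacks kernels for this capability triggers the
# "no kernel image is available for execution on the device" error above.
import torch

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        name = torch.cuda.get_device_name(i)
        major, minor = torch.cuda.get_device_capability(i)
        print(f"GPU {i}: {name}, compute capability {major}.{minor}")
else:
    print("No CUDA device visible to PyTorch")
```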

NoteToSelfFindGoodNickname commented 1 year ago

I think you need to replace all 3 occurrences of train_util.py in the kohya_ss folder. Then it worked for me, even after the update.

tpcdaz commented 1 year ago

I think you need to replace all 3 occurrences of train_util.py in the kohya_ss folder. Then it worked for me, even after the update.

Replace them with what? There is only one instance of train_util.py, and that's in the library folder.

Maranpani commented 1 year ago

Could you do a tutorial video, or ask someone to do one? ^^

Could you explain exactly what you mean by "replace all 3 occurrences of train_util.py in the kohya_ss folder"?

starpause commented 1 year ago

I think you need to replace all 3 occurrences of train_util.py in the kohya_ss folder. Then it worked for me, even after the update.

Replace them with what? There is only one instance of train_util.py, and that's in the library folder.

\kohya_ss\library\train_util.py
\kohya_ss\venv\Lib\site-packages\library\train_util.py
\kohya_ss\build\lib\library\train_util.py
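
For anyone doing the replacement by hand, a small sketch of the copy step, assuming the updated train_util.py from the kohya-ss/sd-scripts link shared above has already been downloaded as new_train_util.py (that filename is an assumption) and the script is run from the kohya_ss folder; the three target paths are the ones listed above:

```python
# Copy an already-downloaded train_util.py over the three copies listed above.
# "new_train_util.py" is an assumed filename; run this from the kohya_ss folder.
import shutil
from pathlib import Path

src = Path("new_train_util.py")  # file fetched from kohya-ss/sd-scripts
targets = [
    Path("library/train_util.py"),
    Path("venv/Lib/site-packages/library/train_util.py"),
    Path("build/lib/library/train_util.py"),
]

for dst in targets:
    if dst.parent.exists():
        shutil.copyfile(src, dst)
        print(f"replaced {dst}")
    else:
        print(f"skipped {dst} (folder not present in this install)")
```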

noobNerd1 commented 1 year ago

I tried all of the above; nothing worked. Then I changed the optimizer from Adam to Lion, and now it's working. I have no idea what that changes in terms of quality, etc., but hey, at least I'm unstuck.

Maranpani commented 1 year ago

I was able to resolve this issue by merging the dev branch into the master branch. This commit 641a168e55f429c79f9114bcdb123a13bc9b2167 resolved it for me and was probably forgotten.

And could you explain for a newbie what that means and how to do it, step by step, please?

Thund3rPat commented 1 year ago

@Maranpani With the last update it is now merged. No need to do it yourself anymore.

Maranpani commented 1 year ago

@Maranpani With the last update it is now merged. No need to do it yourself anymore.

Hello, could you be more specific?

Which update? Update of what? How do I do the update? Thanks

Thund3rPat commented 1 year ago

Hi, of course. To update to the newest release, open the terminal and go to the kohya_ss folder. Execute the upgrade script with:

.\upgrade.ps1

I hope this helps.

Maranpani commented 1 year ago

I think you need to replace all 3 occurrences of train_util.py in the kohya_ss folder. Then it worked for me, even after the update.

Replace them with what? There is only one instance of train_util.py, and that's in the library folder.

\kohya_ss\library\train_util.py
\kohya_ss\venv\Lib\site-packages\library\train_util.py
\kohya_ss\build\lib\library\train_util.py

**It's not working for me, always the CUDA error:**

CUDA SETUP: Loading binary C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll... use 8-bit Adam optimizer running training / 学習開始 num train images * repeats / 学習画像の数×繰り返し回数: 1700 num reg images / 正則化画像の数: 0 num batches per epoch / 1epochのバッチ数: 850 num epochs / epoch数: 1 batch size per device / バッチサイズ: 2 total train batch size (with parallel & distributed & accumulation) / 総バッチサイズ(並列学習、勾配合計含む): 2 gradient accumulation steps / 勾配を合計するステップ数 = 1 total optimization steps / 学習ステップ数: 850 steps: 0%| | 0/850 [00:00<?, ?it/s]epoch 1/1 C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\torch\utils\checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn("None of the inputs have requires_grad=True. Gradients will be None") Error no kernel image is available for execution on the device at line 167 in file D:\ai\tool\bitsandbytes\csrc\ops.cu Traceback (most recent call last): File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\Scripts\accelerate.exe__main__.py", line 7, in File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=//UTILISATEUR-PC/Users/Utilisateur/stable-diffusion-webui/models/Stable-diffusion/realisticVisionV13_v13.safetensors', '--train_data_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\image', '--resolution=512,512', '--output_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\model', '--logging_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=85', '--train_batch_size=2', '--max_train_steps=850', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--cache_latents', '--bucket_reso_steps=64', '--mem_eff_attn', '--gradient_checkpointing', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.

Thund3rPat commented 1 year ago

@Maranpani Can you deactivate 8bit adam and try again?

Maranpani commented 1 year ago

When I click to execute, a window appears, everything runs automatically, and then it closes. I don't get the chance to type anything. Did I misunderstand something?

For information, I followed the tutorial by Olivier Sarikas on YouTube, who said to put it in the "upgrade" folder. So for now, what is written in my folder is:

git pull
.\venv\Scripts\activate
pip install --upgrade -r requirements.txt

What should I replace, all of it? What do you mean by execute? Just write something and save?

Maranpani commented 1 year ago

@Maranpani Can you deactivate 8bit adam and try again? thanks

It WORKS 👍 when we remove Use 8bit adam

So, since it's working without 8bit adam, a few questions:

  1. Are you able to fix the issue at its origin, so that everyone can keep using 8bit adam?
  2. What exactly is "8bit adam"? What happens if we don't use it, versus if we do use it?
  3. For information, here are my current advanced parameter settings (without 8bit adam, as you told me): Gradient checkpointing ON, Shuffle caption OFF, Persistent data loader OFF, Memory efficient attention ON, Use 8bit adam OFF, Use xformers ON, Color augmentation OFF, Flip augmentation OFF, Don't upscale bucket resolution OFF

FlareWaterLily commented 1 year ago

train_network.py: error: unrecognized arguments: 768 Traceback (most recent call last): File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64qbz5n2kfra8p0\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "G:\LoRA\kohya_ss\venv\Scripts\accelerate.exe\main__.py", line 7, in File "G:\LoRA\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "G:\LoRA\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "G:\LoRA\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['G:\LoRA\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=G:/novelai-webui-aki/models/Stable-diffusion/Anything3.ckpt', '--train_data_dir=G:/LoRA/99a', '--resolution=640,', '768', '--output_dir=G:/LoRA/99a', '--logging_dir=', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=1.5e-5', '--unet_lr=1.5e-4', '--network_dim=128', '--output_name=99A', '--lr_scheduler_num_cycles=5', '--learning_rate=0.0001', '--lr_scheduler=constant_with_warmup', '--lr_warmup_steps=13', '--train_batch_size=3', '--max_train_steps=269', '--save_every_n_epochs=5', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=31337', '--cache_latents', '--use_lion_optimizer', '--clip_skip=2', '--bucket_reso_steps=64', '--shuffle_caption', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 2.

The error is still reported after replacing train_util.py.

Thund3rPat commented 1 year ago

@FlareWaterLily The error message says that the resolution parameter is set wrong: there is a space after the comma, so "768" is parsed as a separate, unrecognized argument. Can you check again and set it like this (with no space):

640,768

kithungsam commented 1 year ago

I had the same error; I unticked "Use 8bit adam" and chose an optimizer other than AdamW8bit, and then it started to work.

dokkkku commented 1 year ago

Nothing worked, I tried everything. I still get the same error :(

Maranpani commented 1 year ago

Nothing worked, I tried everything. I still get the same error :(

Deactivate 8bit adam

dokkkku commented 1 year ago

Still didn't work.

NakiriKajiya commented 1 year ago

i tried everything:( Traceback (most recent call last): File "F:\lora-scripts-0.2.0\sd-scripts\train_network.py", line 642, in train(args) File "F:\lora-scripts-0.2.0\sd-scripts\train_network.py", line 114, in train textencoder, vae, unet, = train_util.load_target_model(args, weight_dtype) File "F:\lora-scripts-0.2.0\sd-scripts\library\train_util.py", line 2016, in load_target_model text_encoder, vae, unet = model_util.load_models_from_stable_diffusion_checkpoint(args.v2, name_or_path) File "F:\lora-scripts-0.2.0\sd-scripts\library\model_util.py", line 877, in load_models_from_stable_diffusion_checkpoint converted_unet_checkpoint = convert_ldm_unet_checkpoint(v2, state_dict, unet_config) File "F:\lora-scripts-0.2.0\sd-scripts\library\model_util.py", line 234, in convert_ldm_unet_checkpoint new_checkpoint["time_embedding.linear_1.weight"] = unet_state_dict["time_embed.0.weight"] KeyError: 'time_embed.0.weight' Traceback (most recent call last): File "C:\Users\keina\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\keina\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "F:\lora-scripts-0.2.0\venv\Scripts\accelerate.exe__main__.py", line 7, in File "F:\lora-scripts-0.2.0\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "F:\lora-scripts-0.2.0\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "F:\lora-scripts-0.2.0\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['F:\lora-scripts-0.2.0\venv\Scripts\python.exe', './sd-scripts/train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=./sd-models/before.safetensors', '--train_data_dir=./train', '--output_dir=./output', '--logging_dir=./logs', '--resolution=512,768', '--network_module=networks.lora', '--max_train_epochs=20', '--learning_rate=1e-4', '--unet_lr=1e-4', '--text_encoder_lr=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=0', '--lr_scheduler_num_cycles=1', '--network_dim=64', '--network_alpha=32', '--output_name=after', '--train_batch_size=3', '--save_every_n_epochs=2', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1337', '--cache_latents', '--clip_skip=2', '--prior_loss_weight=1', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--min_bucket_reso=256', '--max_bucket_reso=1024', '--xformers', '--shuffle_caption', '--reg_data_dir=./train/reg', '--use_8bit_adam', '--use_lion_optimizer']' returned non-zero exit status 1. Train finished
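
This one looks like a different failure from the noise_offset bug: the KeyError is raised while converting the base model's U-Net weights, which suggests the file passed as --pretrained_model_name_or_path (./sd-models/before.safetensors in this command) is not a complete Stable Diffusion checkpoint, or is corrupted. A hedged way to inspect which tensors the file actually contains, using the safetensors package these scripts already depend on (the path is taken from the command above):

```python
# List some tensor names from the .safetensors checkpoint; a full SD 1.x model
# should contain a key like "model.diffusion_model.time_embed.0.weight".
from safetensors import safe_open

path = "./sd-models/before.safetensors"  # path taken from the command above

with safe_open(path, framework="pt", device="cpu") as f:
    keys = list(f.keys())

print(f"{len(keys)} tensors in {path}")
print("has time_embed weight:", any("time_embed.0.weight" in k for k in keys))
for k in keys[:10]:
    print(" ", k)
```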

youcanyoubing commented 1 year ago

(quotes the traceback from NakiriKajiya's comment above)

Hi, I know how to fix it: just turn the training batch size down and it will be fine. My 3060 Ti can go up to 4 at most.

AniMoster commented 1 year ago

Nothing worked, I tried everything. I still get the same error :(

Deactivate 8bit adam

What do you mean by deactivate 8bit adam? Like selecting Adam instead of 8bit Adam, or deleting it from the kohya_ss folder? If it's the latter, then please provide the folder directory where I have to delete it.

Currently, after replacing all 3 instances of train_util.py, I'm getting this error:

Traceback (most recent call last): File "C:\Users\name\Kohya\kohya_ss\train_network.py", line 16, in <module> import library.train_util as train_util File "C:\Users\name\Kohya\kohya_ss\library\train_util.py", line 59, in <module> from library.lpw_stable_diffusion import StableDiffusionLongPromptWeightingPipeline ModuleNotFoundError: No module named 'library.lpw_stable_diffusion' Traceback (most recent call last): File "C:\Users\name\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\name\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\Users\name\Kohya\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module> File "C:\Users\name\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "C:\Users\name\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "C:\Users\name\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\\Users\\name\\Kohya\\kohya_ss\\venv\\Scripts\\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=C:/Users/name/stable-diffusion-webui/models/Stable-diffusion/chilloutmix_NiPrunedFp32Fix.safetensors', '--train_data_dir=C:/Users/name/Kohya/LoRA/img', '--resolution=512,512', '--output_dir=C:/Users/name/Kohya/LoRA/model', '--logging_dir=C:/Users/name/Kohya/LoRA/log', '--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=128', '--output_name=Ayane Sakura', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=1', '--max_train_steps=4700', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--mem_eff_attn', '--gradient_checkpointing', '--xformers']' returned non-zero exit status 1.
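
The new traceback explains itself: the replacement train_util.py imports library.lpw_stable_diffusion, a module that exists in current kohya-ss/sd-scripts but not in an older kohya_ss checkout, so copying train_util.py alone is not enough. Either run the normal upgrade (git pull / .\upgrade.ps1 plus pip install, as described above) or also copy lpw_stable_diffusion.py next to each replaced train_util.py. A small sketch to check which library folders are missing it, assuming it is run from the kohya_ss folder and using the three paths listed earlier:

```python
# Check that every "library" folder holding a replaced train_util.py also has
# the lpw_stable_diffusion.py module that the new train_util.py imports.
from pathlib import Path

library_dirs = [
    Path("library"),
    Path("venv/Lib/site-packages/library"),
    Path("build/lib/library"),
]

for d in library_dirs:
    if not d.exists():
        continue
    has_util = (d / "train_util.py").exists()
    has_lpw = (d / "lpw_stable_diffusion.py").exists()
    print(f"{d}: train_util.py={has_util}, lpw_stable_diffusion.py={has_lpw}")
```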

BRuhICK commented 1 year ago

It's still not working, can someone help? I tried everything that was provided here.

Validating that requirements are satisfied. All requirements satisfied. Load CSS... Running on local URL

To create a public link, set share=True in launch(). Loading config... Loading config... Loading config... Folder 100_johnny joestar: 18 images found Folder 100_johnny joestar: 1800 steps max_train_steps = 1800 stop_text_encoder_training = 0 lr_warmup_steps = 0 accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --pretrained_model_name_or_path="C:/AI/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4 .ckpt" --train_data_dir="C:/AI/SAMPLE IMAGES/Johnny Joestar/img" --resolution=512,512 --output_dir="C:/AI/SAMPLE IMAGES/Johnny Joestar/model" --logging_dir="C:/AI/SAMPLE IMAGES/Johnny Joestar/log" --network_alpha="1" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=8 --output_name="JohnnyJoestar" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="constant" --train_batch_size="1" --max_train_steps="1800" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --seed="1234" --caption_extension=".txt" --cache_latents --optimizer_type="AdamW" --max_data_loader_n_workers="1" --clip_skip=2 --bucket_reso_steps=64 --mem_eff_attn --gradient_checkpointing --xformers --bucket_no_upscale prepare tokenizer Use DreamBooth method. prepare images. found directory C:\AI\SAMPLE IMAGES\Johnny Joestar\img\100_johnny joestar contains 18 image files 1800 train images with repeating. 0 reg images. no regularization images / 正則化画像が見つかりませんでした [Dataset 0] batch_size: 1 resolution: (512, 512) enable_bucket: False

[Subset 0 of Dataset 0] image_dir: "C:\AI\SAMPLE IMAGES\Johnny Joestar\img\100_johnny joestar" image_count: 18 num_repeats: 100 shuffle_caption: False keep_tokens: 0 caption_dropout_rate: 0.0 caption_dropout_every_n_epoches: 0 caption_tag_dropout_rate: 0.0 color_aug: False flip_aug: False face_crop_aug_range: None random_crop: False token_warmup_min: 1, token_warmup_step: 0, is_reg: False class_tokens: johnny joestar caption_extension: .txt

[Dataset 0] loading image sizes. 100%|█████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 270.98it/s] prepare dataset prepare accelerator Traceback (most recent call last): File "C:\AI\Kohya\kohya_ss\train_network.py", line 752, in train(args) File "C:\AI\Kohya\kohya_ss\train_network.py", line 140, in train accelerator, unwrap_model = train_util.prepare_accelerator(args) File "C:\AI\Kohya\kohya_ss\library\train_util.py", line 2692, in prepare_accelerator accelerator = Accelerator( File "C:\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 355, in init raise ValueError(err.format(mode="fp16", requirement="a GPU")) ValueError: fp16 mixed precision requires a GPU Traceback (most recent call last): File "C:\Users\Asus\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\Asus\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\AI\Kohya\kohya_ss\venv\Scripts\accelerate.exe__main__.py", line 7, in File "C:\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "C:\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "C:\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\AI\Kohya\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--pretrained_model_name_or_path=C:/AI/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4 .ckpt', '--train_data_dir=C:/AI/SAMPLE IMAGES/Johnny Joestar/img', '--resolution=512,512', '--output_dir=C:/AI/SAMPLE IMAGES/Johnny Joestar/model', '--logging_dir=C:/AI/SAMPLE IMAGES/Johnny Joestar/log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=JohnnyJoestar', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=1', '--max_train_steps=1800', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--mem_eff_attn', '--gradient_checkpointing', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.

That's the entire log.
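
A note on this last log: the actual failure is "ValueError: fp16 mixed precision requires a GPU", raised by accelerate before training starts, so it is unrelated to the noise_offset bug above. It usually means the PyTorch inside the venv cannot see a CUDA GPU (a CPU-only torch build, a driver problem, or accelerate configured without a GPU); re-running accelerate config, or setting Mixed precision to "no" if you really are training on CPU, are the usual next steps. A minimal check sketch using only standard torch calls:

```python
# Check whether the PyTorch inside the kohya_ss venv can actually see a CUDA
# GPU; if this prints False, accelerate will raise exactly the
# "fp16 mixed precision requires a GPU" ValueError shown above.
import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```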