kohya-ss / sd-scripts

Apache License 2.0

Failed to run kohya when training a LoRA model #1226

Open afifulinuha opened 7 months ago

afifulinuha commented 7 months ago

Hello, I have an issue running kohya. Could someone please suggest a solution? Thank you.

this is the error message:


  06:34:57-522654 INFO Start training LoRA Standard ...
  06:34:57-524653 INFO Validating model file or folder path D:/Program files/kohya/training/base/v1-5-pruned.safetensors existence...
  06:34:57-525653 INFO ...valid
  06:34:57-526652 INFO Validating output_dir path D:/Program files/kohya/training/lora_sdxl\model existence...
  06:34:57-527653 INFO ...valid
  06:34:57-528651 INFO Validating train_data_dir path D:/Program files/kohya/training/lora_sdxl\img existence...
  06:34:57-528651 INFO ...valid
  06:34:57-529651 INFO Validating reg_data_dir path D:/Program files/kohya/stable-diffusion-regularization-images-main/demo/women/upscale existence...
  06:34:57-530653 INFO ...valid
  06:34:57-531663 INFO Validating logging_dir path D:/Program files/kohya/training/lora_sdxl\log existence...
  06:34:57-531663 INFO ...valid
  06:34:57-532652 INFO log_tracker_config not specified, skipping validation
  06:34:57-533657 INFO resume not specified, skipping validation
  06:34:57-533657 INFO vae not specified, skipping validation
  06:34:57-534652 INFO lora_network_weights not specified, skipping validation
  06:34:57-535655 INFO dataset_config not specified, skipping validation
  06:34:57-537652 INFO Folder 40_4urel1emoeramans women: 40 images found
  06:34:57-538677 INFO Folder 40_4urel1emoeramans women: 1600 steps
  06:34:57-539664 WARNING Regularisation images are used... Will double the number of steps required...
  06:34:57-540651 INFO Total steps: 1600
  06:34:57-541656 INFO Train batch size: 8
  06:34:57-542651 INFO Gradient accumulation steps: 3
  06:34:57-544030 INFO Epoch: 10
  06:34:57-544030 INFO Regulatization factor: 2
  06:34:57-545035 INFO max_train_steps (1600 / 8 / 3 * 10 * 2) = 1334
  06:34:57-546035 INFO stop_text_encoder_training = 0
  06:34:57-547034 INFO lr_warmup_steps = 133
  06:34:57-547034 INFO Can't use LR warmup with LR Scheduler constant... ignoring...
  06:34:57-548035 INFO Saving training config to D:/Program files/kohya/training/lora_sdxl\model\4urel1emoeramans_20240331-063457.json...
06:34:57-551035 INFO accelerate launch --num_cpu_threads_per_process=2 "D:\Program files\kohya\kohya_ss/sd-scripts/train_network.py" --bucket_no_upscale --bucket_reso_steps=64 --cache_latents --cache_latents_to_disk --caption_extension=".txt" --clip_skip=2 --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --gradient_accumulation_steps=3 --gradient_checkpointing --learning_rate="0.0001" --logging_dir="D:/Program files/kohya/training/lora_sdxl\log" --lr_scheduler="constant" --lr_scheduler_num_cycles="10" --max_data_loader_n_workers="0" --max_grad_norm="1" --resolution=""768,768"" --max_train_steps="1334" --mem_eff_attn --mixed_precision="bf16" --network_alpha="1" --network_dim=256 --network_module=networks.lora --optimizer_args --cache_text_encoder_outputs --network_train_unet_only --bucket_reso_steps="32" --optimizer_type="AdamW8bit" --output_dir="D:/Program files/kohya/training/lora_sdxl\model" --output_name="4urel1emoeramans" --pretrained_model_name_or_path="D:/Program files/kohya/training/base/v1-5-pruned.safetensors" --reg_data_dir="D:/Program files/kohya/stable-diffusion-regularization-images-main/demo/women/upscale" --save_every_n_epochs="1" --save_model_as=safetensors --save_precision="bf16" --train_batch_size="8" --training_comment="4urel1emoeramans" --train_data_dir="D:/Program files/kohya/training/lora_sdxl\img" --v_parameterization --v2 --xformers A matching Triton is not available, some optimizations will not be enabled. 
Error caught was: No module named 'triton' usage: train_network.py [-h] [--console_log_level {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--console_log_file CONSOLE_LOG_FILE] [--console_log_simple] [--v2] [--v_parameterization] [--pretrained_model_name_or_path PRETRAINED_MODEL_NAME_OR_PATH] [--tokenizer_cache_dir TOKENIZER_CACHE_DIR] [--train_data_dir TRAIN_DATA_DIR] [--shuffle_caption] [--caption_separator CAPTION_SEPARATOR] [--caption_extension CAPTION_EXTENSION] [--caption_extention CAPTION_EXTENTION] [--keep_tokens KEEP_TOKENS] [--keep_tokens_separator KEEP_TOKENS_SEPARATOR] [--caption_prefix CAPTION_PREFIX] [--caption_suffix CAPTION_SUFFIX] [--color_aug] [--flip_aug] [--face_crop_aug_range FACE_CROP_AUG_RANGE] [--random_crop] [--debug_dataset] [--resolution RESOLUTION] [--cache_latents] [--vae_batch_size VAE_BATCH_SIZE] [--cache_latents_to_disk] [--enable_bucket] [--min_bucket_reso MIN_BUCKET_RESO] [--max_bucket_reso MAX_BUCKET_RESO] [--bucket_reso_steps BUCKET_RESO_STEPS] [--bucket_no_upscale] [--token_warmup_min TOKEN_WARMUP_MIN] [--token_warmup_step TOKEN_WARMUP_STEP] [--dataset_class DATASET_CLASS] [--caption_dropout_rate CAPTION_DROPOUT_RATE] [--caption_dropout_every_n_epochs CAPTION_DROPOUT_EVERY_N_EPOCHS] [--caption_tag_dropout_rate CAPTION_TAG_DROPOUT_RATE] [--reg_data_dir REG_DATA_DIR] [--in_json IN_JSON] [--dataset_repeats DATASET_REPEATS] [--output_dir OUTPUT_DIR] [--output_name OUTPUT_NAME] [--huggingface_repo_id HUGGINGFACE_REPO_ID] [--huggingface_repo_type HUGGINGFACE_REPO_TYPE] [--huggingface_path_in_repo HUGGINGFACE_PATH_IN_REPO] [--huggingface_token HUGGINGFACE_TOKEN] [--huggingface_repo_visibility HUGGINGFACE_REPO_VISIBILITY] [--save_state_to_huggingface] [--resume_from_huggingface] [--async_upload] [--save_precision {None,float,fp16,bf16}] [--save_every_n_epochs SAVE_EVERY_N_EPOCHS] [--save_every_n_steps SAVE_EVERY_N_STEPS] [--save_n_epoch_ratio SAVE_N_EPOCH_RATIO] [--save_last_n_epochs SAVE_LAST_N_EPOCHS] [--save_last_n_epochs_state 
SAVE_LAST_N_EPOCHS_STATE] [--save_last_n_steps SAVE_LAST_N_STEPS] [--save_last_n_steps_state SAVE_LAST_N_STEPS_STATE] [--save_state] [--resume RESUME] [--train_batch_size TRAIN_BATCH_SIZE] [--max_token_length {None,150,225}] [--mem_eff_attn] [--torch_compile] [--dynamo_backend {eager,aot_eager,inductor,aot_ts_nvfuser,nvprims_nvfuser,cudagraphs,ofi,fx2trt,onnxrt}] [--xformers] [--sdpa] [--vae VAE] [--max_train_steps MAX_TRAIN_STEPS] [--max_train_epochs MAX_TRAIN_EPOCHS] [--max_data_loader_n_workers MAX_DATA_LOADER_N_WORKERS] [--persistent_data_loader_workers] [--seed SEED] [--gradient_checkpointing] [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS] [--mixed_precision {no,fp16,bf16}] [--full_fp16] [--full_bf16] [--fp8_base] [--ddp_timeout DDP_TIMEOUT] [--ddp_gradient_as_bucket_view] [--ddp_static_graph] [--clip_skip CLIP_SKIP] [--logging_dir LOGGING_DIR] [--log_with {tensorboard,wandb,all}] [--log_prefix LOG_PREFIX] [--log_tracker_name LOG_TRACKER_NAME] [--wandb_run_name WANDB_RUN_NAME] [--log_tracker_config LOG_TRACKER_CONFIG] [--wandb_api_key WANDB_API_KEY] [--noise_offset NOISE_OFFSET] [--multires_noise_iterations MULTIRES_NOISE_ITERATIONS] [--ip_noise_gamma IP_NOISE_GAMMA] [--multires_noise_discount MULTIRES_NOISE_DISCOUNT] [--adaptive_noise_scale ADAPTIVE_NOISE_SCALE] [--zero_terminal_snr] [--min_timestep MIN_TIMESTEP] [--max_timestep MAX_TIMESTEP] [--lowram] [--highvram] [--sample_every_n_steps SAMPLE_EVERY_N_STEPS] [--sample_at_first] [--sample_every_n_epochs SAMPLE_EVERY_N_EPOCHS] [--sample_prompts SAMPLE_PROMPTS] [--sample_sampler {ddim,pndm,lms,euler,euler_a,heun,dpm_2,dpm_2_a,dpmsolver,dpmsolver++,dpmsingle,k_lms,k_euler,k_euler_a,k_dpm_2,k_dpm_2_a}] [--config_file CONFIG_FILE] [--output_config] [--metadata_title METADATA_TITLE] [--metadata_author METADATA_AUTHOR] [--metadata_description METADATA_DESCRIPTION] [--metadata_license METADATA_LICENSE] [--metadata_tags METADATA_TAGS] [--prior_loss_weight PRIOR_LOSS_WEIGHT] [--optimizer_type 
OPTIMIZER_TYPE] [--use_8bit_adam] [--use_lion_optimizer] [--learning_rate LEARNING_RATE] [--max_grad_norm MAX_GRAD_NORM] [--optimizer_args [OPTIMIZER_ARGS ...]] [--lr_scheduler_type LR_SCHEDULER_TYPE] [--lr_scheduler_args [LR_SCHEDULER_ARGS ...]] [--lr_scheduler LR_SCHEDULER] [--lr_warmup_steps LR_WARMUP_STEPS] [--lr_scheduler_num_cycles LR_SCHEDULER_NUM_CYCLES] [--lr_scheduler_power LR_SCHEDULER_POWER] [--dataset_config DATASET_CONFIG] [--min_snr_gamma MIN_SNR_GAMMA] [--scale_v_pred_loss_like_noise_pred] [--v_pred_like_loss V_PRED_LIKE_LOSS] [--debiased_estimation_loss] [--weighted_captions] [--no_metadata] [--save_model_as {None,ckpt,pt,safetensors}] [--unet_lr UNET_LR] [--text_encoder_lr TEXT_ENCODER_LR] [--network_weights NETWORK_WEIGHTS] [--network_module NETWORK_MODULE] [--network_dim NETWORK_DIM] [--network_alpha NETWORK_ALPHA] [--network_dropout NETWORK_DROPOUT] [--network_args [NETWORK_ARGS ...]] [--network_train_unet_only] [--network_train_text_encoder_only] [--training_comment TRAINING_COMMENT] [--dim_from_weights] [--scale_weight_norms SCALE_WEIGHT_NORMS] [--base_weights [BASE_WEIGHTS ...]] [--base_weights_multiplier [BASE_WEIGHTS_MULTIPLIER ...]] [--no_half_vae] train_network.py: error: unrecognized arguments: --cache_text_encoder_outputs Traceback (most recent call last): File "C:\Users\afifu\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\afifu\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "D:\Program files\kohya\kohya_ss\venv\Scripts\accelerate.exe__main__.py", line 7, in File "D:\Program files\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main args.func(args) File "D:\Program files\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command simple_launcher(args) File "D:\Program 
files\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['D:\Program files\kohya\kohya_ss\venv\Scripts\python.exe', 'D:\Program files\kohya\kohya_ss/sd-scripts/train_network.py', '--bucket_no_upscale', '--bucket_reso_steps=64', '--cache_latents', '--cache_latents_to_disk', '--caption_extension=.txt', '--clip_skip=2', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--gradient_accumulation_steps=3', '--gradient_checkpointing', '--learning_rate=0.0001', '--logging_dir=D:/Program files/kohya/training/lora_sdxl\log', '--lr_scheduler=constant', '--lr_scheduler_num_cycles=10', '--max_data_loader_n_workers=0', '--max_grad_norm=1', '--resolution=768,768', '--max_train_steps=1334', '--mem_eff_attn', '--mixed_precision=bf16', '--network_alpha=1', '--network_dim=256', '--network_module=networks.lora', '--optimizer_args', '--cache_text_encoder_outputs', '--network_train_unet_only', '--bucket_reso_steps=32', '--optimizer_type=AdamW8bit', '--output_dir=D:/Program files/kohya/training/lora_sdxl\model', '--output_name=4urel1emoeramans', '--pretrained_model_name_or_path=D:/Program files/kohya/training/base/v1-5-pruned.safetensors', '--reg_data_dir=D:/Program files/kohya/stable-diffusion-regularization-images-main/demo/women/upscale', '--save_every_n_epochs=1', '--save_model_as=safetensors', '--save_precision=bf16', '--train_batch_size=8', '--training_comment=4urel1emoeramans', '--train_data_dir=D:/Program files/kohya/training/lora_sdxl\img', '--v_parameterization', '--v2', '--xformers']' returned non-zero exit status 2.
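
The max_train_steps value printed in the log above can be checked by hand. The following is a minimal sketch, not the GUI's actual code: the function name and the use of rounding up are assumptions, chosen because they reproduce the logged result of 1334 from 1600 steps, batch size 8, 3 gradient accumulation steps, 10 epochs and a regularization factor of 2.

```python
import math


def max_train_steps(total_steps, batch_size, grad_accum_steps, epochs, reg_factor):
    """Reproduce the step count printed in the GUI log (rounding up is an assumption)."""
    return math.ceil(total_steps / batch_size / grad_accum_steps * epochs * reg_factor)


# Values taken from the log: 40 images x 40 repeats = 1600 steps per epoch.
steps = max_train_steps(1600, batch_size=8, grad_accum_steps=3, epochs=10, reg_factor=2)
print(steps)
```

With batch size 1, no gradient accumulation and a regularization factor of 1, the same expression gives 16000, which matches the second run later in this thread.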

kohya-ss commented 7 months ago

--cache_text_encoder_outputs is not supported in SD1/2 training. Please remove the option (from GUI or from the command line).
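
For readers wondering why the run stopped with "unrecognized arguments" and exit status 2: that is how Python's argparse reacts to an option the parser does not define. The sketch below is an illustration only. `build_parser` and `exit_code_for` are hypothetical names, and the two options registered here are a tiny stand-in for the real train_network.py parser.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the SD1/2 training parser: it defines
    # --cache_latents but has no --cache_text_encoder_outputs option.
    parser = argparse.ArgumentParser(prog="train_network.py")
    parser.add_argument("--cache_latents", action="store_true")
    parser.add_argument("--network_train_unet_only", action="store_true")
    return parser


def exit_code_for(argv):
    """Return 0 if argv parses, otherwise the exit code argparse uses."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:
        # argparse prints the usage text to stderr and exits on unknown options.
        return exc.code
    return 0


print(exit_code_for(["--cache_latents"]))
print(exit_code_for(["--cache_latents", "--cache_text_encoder_outputs"]))
```

The second call fails before any training code runs, which is why the log shows only the usage text followed by the accelerate launcher's CalledProcessError.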

afifulinuha commented 7 months ago

--cache_text_encoder_outputs is not supported in SD1/2 training. Please remove the option (from GUI or from the command line).

I have removed it, but it still won't work. I'm still wondering what the problem is :(

To create a public link, set share=True in launch().
13:55:22-083189 INFO Loading config...
13:57:22-132007 INFO Copy D:/Program files/kohya/training/model/aurelie/images to D:/Program files/kohya/training/lora_1.5\img/40_4urel1emoeramans woman...
13:57:22-906250 INFO Regularization images directory is missing... not copying regularisation images...
13:57:23-157522 INFO Done creating kohya_ss training folder structure at D:/Program files/kohya/training/lora_1.5...
13:57:36-464516 INFO Start training LoRA Standard ...
13:57:36-465514 INFO Validating model file or folder path D:/Program files/kohya/training/base/v1-5-pruned.safetensors existence...
13:57:36-466514 INFO ...valid
13:57:36-467521 INFO Validating output_dir path D:/Program files/kohya/training/lora_1.5\model existence...
13:57:36-468522 INFO ...valid
13:57:36-468522 INFO Validating train_data_dir path D:/Program files/kohya/training/lora_1.5\img existence...
13:57:36-469520 INFO ...valid
13:57:36-469520 INFO reg_data_dir not specified, skipping validation
13:57:36-470520 INFO Validating logging_dir path D:/Program files/kohya/training/lora_1.5\log existence...
13:57:36-472513 INFO ...valid
13:57:36-473514 INFO log_tracker_config not specified, skipping validation
13:57:36-474513 INFO resume not specified, skipping validation
13:57:36-474513 INFO vae not specified, skipping validation
13:57:36-475524 INFO lora_network_weights not specified, skipping validation
13:57:36-476525 INFO dataset_config not specified, skipping validation
13:57:36-478525 INFO Folder 40_4urel1emoeramans woman: 40 images found
13:57:36-478525 INFO Folder 40_4urel1emoeramans woman: 1600 steps
13:57:36-479523 INFO Total steps: 1600
13:57:36-480514 INFO Train batch size: 1
13:57:36-481513 INFO Gradient accumulation steps: 1
13:57:36-482513 INFO Epoch: 10
13:57:36-483514 INFO Regulatization factor: 1
13:57:36-483514 INFO max_train_steps (1600 / 1 / 1 * 10 * 1) = 16000
13:57:36-485515 INFO stop_text_encoder_training = 0
13:57:36-485515 INFO lr_warmup_steps = 1600
13:57:36-608851 INFO Saving training config to D:/Program files/kohya/training/lora_1.5\model\4urel1emoeramans_20240331-135736.json...
13:57:36-611853 INFO accelerate launch --num_cpu_threads_per_process=2 "D:\Program files\kohya\kohya_ss/sd-scripts/train_network.py" --network_train_unet_only --bucket_no_upscale --bucket_reso_steps=64 --cache_latents --cache_latents_to_disk --caption_extension=".txt" --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --full_bf16 --gradient_checkpointing --learning_rate="0.0001" --logging_dir="D:/Program files/kohya/training/lora_1.5\log" --lr_scheduler="cosine" --lr_scheduler_num_cycles="10" --lr_warmup_steps="1600" --max_data_loader_n_workers="0" --max_grad_norm="1" --resolution="1024,1024" --max_train_steps="16000" --mixed_precision="bf16" --network_alpha="1" --network_dim=8 --network_module=networks.lora --optimizer_args network_train_unet_only --bucket_reso_steps="32" --optimizer_type="AdamW8bit" --output_dir="D:/Program files/kohya/training/lora_1.5\model" --output_name="4urel1emoeramans" --pretrained_model_name_or_path="D:/Program files/kohya/training/base/v1-5-pruned.safetensors" --save_every_n_epochs="1" --save_model_as=safetensors --save_precision="bf16" --seed="1" --train_batch_size="1" --training_comment="4urel1emoeramans" --train_data_dir="D:/Program files/kohya/training/lora_1.5\img" --v_parameterization --v2 --xformers --sample_sampler=euler_a --sample_prompts="D:/Program files/kohya/training/lora_1.5\model\sample\prompt.txt" --sample_every_n_epochs=1 A matching Triton is not available, some optimizations will not be enabled. Error caught was: No module named 'triton' 2024-03-31 13:57:46 INFO prepare tokenizer train_util.py:3959 2024-03-31 13:57:47 INFO Using DreamBooth method. train_network.py:173 INFO prepare images. train_util.py:1469 INFO found directory D:\Program train_util.py:1432 files\kohya\training\lora_1.5\img\40_4urel1emoeramans woman contains 40 image files INFO 1600 train images with repeating. train_util.py:1508 INFO 0 reg images. 
train_util.py:1511 WARNING no regularization images / 正則化画像が見つかりませんでした train_util.py:1516 INFO [Dataset 0] config_util.py:544 batch_size: 1 resolution: (1024, 1024) enable_bucket: True network_multiplier: 1.0 min_bucket_reso: 256 max_bucket_reso: 2048 bucket_reso_steps: 32 bucket_no_upscale: True

                           [Subset 0 of Dataset 0]
                             image_dir: "D:\Program
                         files\kohya\training\lora_1.5\img\40_4urel1emoeramans woman"
                             image_count: 40
                             num_repeats: 40
                             shuffle_caption: False
                             keep_tokens: 0
                             keep_tokens_separator:
                             caption_dropout_rate: 0.0
                             caption_dropout_every_n_epoches: 0
                             caption_tag_dropout_rate: 0.0
                             caption_prefix: None
                             caption_suffix: None
                             color_aug: False
                             flip_aug: False
                             face_crop_aug_range: None
                             random_crop: False
                             token_warmup_min: 1,
                             token_warmup_step: 0,
                             is_reg: False
                             class_tokens: 4urel1emoeramans woman
                             caption_extension: .txt

                INFO     [Dataset 0]                                                              config_util.py:550
                INFO     loading image sizes.                                                      train_util.py:794

100%|█████████████████████████████████████████████████████████████████████████████████| 40/40 [00:00<00:00, 367.24it/s] INFO make buckets train_util.py:800 WARNING min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is train_util.py:817 set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計 算されるため、min_bucket_resoとmax_bucket_resoは無視されます INFO number of images (including repeats) / train_util.py:846 各bucketの画像枚数(繰り返し回数を含む) INFO bucket 0: resolution (352, 640), count: 40 train_util.py:851 INFO bucket 1: resolution (384, 448), count: 40 train_util.py:851 INFO bucket 2: resolution (384, 544), count: 40 train_util.py:851 INFO bucket 3: resolution (384, 576), count: 40 train_util.py:851 INFO bucket 4: resolution (384, 640), count: 120 train_util.py:851 INFO bucket 5: resolution (416, 608), count: 80 train_util.py:851 INFO bucket 6: resolution (416, 672), count: 40 train_util.py:851 INFO bucket 7: resolution (416, 704), count: 40 train_util.py:851 INFO bucket 8: resolution (416, 736), count: 40 train_util.py:851 INFO bucket 9: resolution (448, 544), count: 40 train_util.py:851 INFO bucket 10: resolution (448, 672), count: 40 train_util.py:851 INFO bucket 11: resolution (480, 640), count: 40 train_util.py:851 INFO bucket 12: resolution (480, 672), count: 40 train_util.py:851 INFO bucket 13: resolution (512, 704), count: 40 train_util.py:851 INFO bucket 14: resolution (544, 672), count: 40 train_util.py:851 INFO bucket 15: resolution (544, 736), count: 40 train_util.py:851 INFO bucket 16: resolution (544, 768), count: 40 train_util.py:851 INFO bucket 17: resolution (576, 800), count: 40 train_util.py:851 INFO bucket 18: resolution (576, 832), count: 40 train_util.py:851 INFO bucket 19: resolution (608, 736), count: 40 train_util.py:851 INFO bucket 20: resolution (672, 800), count: 40 train_util.py:851 INFO bucket 21: resolution (1024, 1024), count: 640 train_util.py:851 INFO mean ar error (without 
repeats): 0.008661231393120947 train_util.py:856 INFO preparing accelerator train_network.py:226 accelerator device: cuda INFO loading model for process 0/1 train_util.py:4111 INFO load StableDiffusion checkpoint: D:/Program train_util.py:4066 files/kohya/training/base/v1-5-pruned.safetensors 2024-03-31 13:57:48 INFO UNet2DConditionModel: 64, [5, 10, 20, 20], 1024, False, False original_unet.py:1387 Traceback (most recent call last): File "D:\Program files\kohya\kohya_ss\sd-scripts\train_network.py", line 1058, in trainer.train(args) File "D:\Program files\kohya\kohya_ss\sd-scripts\train_network.py", line 235, in train model_version, text_encoder, vae, unet = self.load_target_model(args, weight_dtype, accelerator) File "D:\Program files\kohya\kohya_ss\sd-scripts\train_network.py", line 103, in load_target_model textencoder, vae, unet, = train_util.load_target_model(args, weight_dtype, accelerator) File "D:\Program files\kohya\kohya_ss\sd-scripts\library\train_util.py", line 4113, in load_target_model text_encoder, vae, unet, load_stable_diffusion_format = _load_target_model( File "D:\Program files\kohya\kohya_ss\sd-scripts\library\train_util.py", line 4067, in _load_target_model text_encoder, vae, unet = model_util.load_models_from_stable_diffusion_checkpoint( File "D:\Program files\kohya\kohya_ss\sd-scripts\library\model_util.py", line 1008, in load_models_from_stable_diffusion_checkpoint info = unet.load_state_dict(converted_unet_checkpoint) File "D:\Program files\kohya\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 2152, in load_state_dict raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format( RuntimeError: Error(s) in loading state_dict for UNet2DConditionModel: size mismatch for down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]). 
size mismatch for down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]). size mismatch for down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]). size mismatch for down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]). size mismatch for down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]). size mismatch for down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]). size mismatch for down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]). size mismatch for down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]). size mismatch for down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). size mismatch for down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). 
size mismatch for down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). size mismatch for down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). size mismatch for up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]). 
size mismatch for up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]). size mismatch for up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]). size mismatch for up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]). size mismatch for up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]). size mismatch for up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]). size mismatch for up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]). size mismatch for up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]). size mismatch for up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]). size mismatch for up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]). 
size mismatch for up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]). size mismatch for up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]). size mismatch for mid_block.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). size mismatch for mid_block.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]). Traceback (most recent call last): File "C:\Users\afifu\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\afifu\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "D:\Program files\kohya\kohya_ss\venv\Scripts\accelerate.exe__main__.py", line 7, in File "D:\Program files\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main args.func(args) File "D:\Program files\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command simple_launcher(args) File "D:\Program files\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['D:\Program files\kohya\kohya_ss\venv\Scripts\python.exe', 'D:\Program files\kohya\kohya_ss/sd-scripts/train_network.py', '--network_train_unet_only', '--bucket_no_upscale', '--bucket_reso_steps=64', '--cache_latents', '--cache_latents_to_disk', 
'--caption_extension=.txt', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--full_bf16', '--gradient_checkpointing', '--learning_rate=0.0001', '--logging_dir=D:/Program files/kohya/training/lora_1.5\log', '--lr_scheduler=cosine', '--lr_scheduler_num_cycles=10', '--lr_warmup_steps=1600', '--max_data_loader_n_workers=0', '--max_grad_norm=1', '--resolution=1024,1024', '--max_train_steps=16000', '--mixed_precision=bf16', '--network_alpha=1', '--network_dim=8', '--network_module=networks.lora', '--optimizer_args', 'network_train_unet_only', '--bucket_reso_steps=32', '--optimizer_type=AdamW8bit', '--output_dir=D:/Program files/kohya/training/lora_1.5\model', '--output_name=4urel1emoeramans', '--pretrained_model_name_or_path=D:/Program files/kohya/training/base/v1-5-pruned.safetensors', '--save_every_n_epochs=1', '--save_model_as=safetensors', '--save_precision=bf16', '--seed=1', '--train_batch_size=1', '--training_comment=4urel1emoeramans', '--train_data_dir=D:/Program files/kohya/training/lora_1.5\img', '--v_parameterization', '--v2', '--xformers', '--sample_sampler=euler_a', '--sample_prompts=D:/Program files/kohya/training/lora_1.5\model\sample\prompt.txt', '--sample_every_n_epochs=1']' returned non-zero exit status 1.

kohya-ss commented 7 months ago

Please remove --v2 and --v_parameterization options. They are for V2 models.
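
The size mismatch in the traceback follows from the text-encoder width: SD 1.x checkpoints use 768-dimensional text embeddings, while --v2 builds a U-Net that expects 1024. The sketch below is a hedged illustration of that check. `needs_v2_flag` and the shape mapping are hypothetical and not part of sd-scripts; the parameter name and shapes are copied from the error message above.

```python
# Cross-attention key/value projections take text-encoder features as input,
# so the second dimension of attn2.to_k.weight is the text embedding width.
SD1_CONTEXT_DIM = 768   # width found in the checkpoint, per the error above
SD2_CONTEXT_DIM = 1024  # width the model expected while --v2 was set

PROBE_KEY = "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight"


def needs_v2_flag(shapes):
    """Guess whether --v2 fits, given a {parameter name: shape} mapping."""
    context_dim = shapes[PROBE_KEY][1]
    if context_dim == SD1_CONTEXT_DIM:
        return False
    if context_dim == SD2_CONTEXT_DIM:
        return True
    raise ValueError(f"unexpected cross-attention width: {context_dim}")


checkpoint_shapes = {PROBE_KEY: (320, 768)}  # shape reported for v1-5-pruned
print(needs_v2_flag(checkpoint_shapes))
```

For v1-5-pruned.safetensors the check says the flag does not apply, which agrees with the advice to drop --v2 and --v_parameterization.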

afifulinuha commented 7 months ago

Please remove --v2 and --v_parameterization options. They are for V2 models.

I have removed them and it seemed to be doing fine for a moment, but then I got an error in this section:

                INFO     loading model for process 0/1                                            train_util.py:4111
                INFO     load StableDiffusion checkpoint: D:/Program                              train_util.py:4066
                         files/kohya/training/base/v1-5-pruned.safetensors
                INFO     UNet2DConditionModel: 64, 8, 768, False, False                        original_unet.py:1387

2024-03-31 14:10:57 INFO loading u-net: model_util.py:1009
2024-03-31 14:11:03 INFO loading vae: model_util.py:1017
2024-03-31 14:11:13 INFO loading text encoder: model_util.py:1074
2024-03-31 14:11:14 INFO Enable xformers for U-Net train_util.py:2529
import network module: networks.lora
2024-03-31 14:11:15 INFO [Dataset 0] train_util.py:1948
INFO caching latents. train_util.py:915
INFO checking cache validity... train_util.py:925
100%|████████████████████████████████████████████████████████████████████████████████| 40/40 [00:00<00:00, 6665.56it/s]
INFO caching latents... train_util.py:962
100%|██████████████████████████████████████████████████████████████████████████████████| 40/40 [00:17<00:00, 2.29it/s]
2024-03-31 14:11:33 INFO create LoRA network. base dim (rank): 8, alpha: 1.0 lora.py:811
INFO neuron dropout: p=None, rank dropout: p=None, module dropout: p=None lora.py:812
INFO create LoRA for Text Encoder: lora.py:906
INFO create LoRA for Text Encoder: 72 modules. lora.py:911
INFO create LoRA for U-Net: 192 modules. lora.py:919
INFO enable LoRA for U-Net lora.py:967
INFO CrossAttnDownBlock2D False -> True original_unet.py:1521
INFO CrossAttnDownBlock2D False -> True original_unet.py:1521
INFO CrossAttnDownBlock2D False -> True original_unet.py:1521
INFO DownBlock2D False -> True original_unet.py:1521
INFO UNetMidBlock2DCrossAttn False -> True original_unet.py:1521
INFO UpBlock2D False -> True original_unet.py:1521
INFO CrossAttnUpBlock2D False -> True original_unet.py:1521
INFO CrossAttnUpBlock2D False -> True original_unet.py:1521
INFO CrossAttnUpBlock2D False -> True original_unet.py:1521
prepare optimizer, data loader etc.
Traceback (most recent call last):
  File "D:\Program files\kohya\kohya_ss\sd-scripts\train_network.py", line 1058, in <module>
    trainer.train(args)
  File "D:\Program files\kohya\kohya_ss\sd-scripts\train_network.py", line 349, in train
    optimizer_name, optimizer_args, optimizer = train_util.get_optimizer(args, trainable_params)
  File "D:\Program files\kohya\kohya_ss\sd-scripts\library\train_util.py", line 3585, in get_optimizer
    key, value = arg.split("=")
ValueError: not enough values to unpack (expected 2, got 1)
Traceback (most recent call last):
  File "C:\Users\afifu\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\afifu\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\Program files\kohya\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "D:\Program files\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
    args.func(args)
  File "D:\Program files\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "D:\Program files\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\Program files\kohya\kohya_ss\venv\Scripts\python.exe', 'D:\Program files\kohya\kohya_ss/sd-scripts/train_network.py', '--network_train_unet_only', '--bucket_no_upscale', '--bucket_reso_steps=64', '--cache_latents', '--cache_latents_to_disk', '--caption_extension=.txt', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--full_bf16', '--gradient_checkpointing', '--learning_rate=0.0001', '--logging_dir=D:/Program files/kohya/training/lora_1.5\log', '--lr_scheduler=cosine', '--lr_scheduler_num_cycles=10', '--lr_warmup_steps=1600', '--max_data_loader_n_workers=0', '--max_grad_norm=1', '--resolution=1024,1024', '--max_train_steps=16000', '--mixed_precision=bf16', '--network_alpha=1', '--network_dim=8', '--network_module=networks.lora', '--optimizer_args', 'network_train_unet_only', '--bucket_reso_steps=32', '--optimizer_type=AdamW8bit', '--output_dir=D:/Program files/kohya/training/lora_1.5\model', '--output_name=4urel1emoeramans', '--pretrained_model_name_or_path=D:/Program files/kohya/training/base/v1-5-pruned.safetensors', '--save_every_n_epochs=1', '--save_model_as=safetensors', '--save_precision=bf16', '--seed=1', '--train_batch_size=1', '--training_comment=4urel1emoeramans', '--train_data_dir=D:/Program files/kohya/training/lora_1.5\img', '--xformers', '--sample_sampler=euler_a', '--sample_prompts=D:/Program files/kohya/training/lora_1.5\model\sample\prompt.txt', '--sample_every_n_epochs=1']' returned non-zero exit status 1.

kohya-ss commented 7 months ago

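--optimizer_args network_train_unet_only passes network_train_unet_only as the value of the --optimizer_args option. That option expects key=value pairs, so arg.split("=") in get_optimizer fails with the ValueError above. Please change --optimizer_args network_train_unet_only to --network_train_unet_only.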
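To illustrate why the traceback ends in "not enough values to unpack", here is a minimal sketch of parsing key=value optimizer arguments. The function name parse_optimizer_args and the explicit error message are hypothetical and are not the sd-scripts implementation; only the arg.split("=") behaviour is taken from the traceback.

```python
import ast


def parse_optimizer_args(raw_args):
    """Turn ["weight_decay=0.01", ...] into a kwargs dict.

    Hypothetical helper for illustration; it mirrors the key=value
    split seen in the traceback but adds an explicit check.
    """
    kwargs = {}
    for arg in raw_args:
        if "=" not in arg:
            # A bare word such as "network_train_unet_only" has no "=",
            # so a plain `key, value = arg.split("=")` cannot unpack it.
            raise ValueError(f"optimizer arg must be key=value, got: {arg!r}")
        key, value = arg.split("=", 1)
        try:
            # Convert numbers, tuples, booleans, etc. to Python values.
            value = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            pass  # leave it as a plain string
        kwargs[key] = value
    return kwargs


print(parse_optimizer_args(["weight_decay=0.01", "betas=(0.9,0.999)"]))

try:
    parse_optimizer_args(["network_train_unet_only"])
except ValueError as err:
    print("rejected:", err)
```

The second call reproduces the situation in this issue: the value handed to --optimizer_args contains no "=", so it cannot be split into a key and a value.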

afifulinuha commented 7 months ago

Thank you, now it's running. Do you know why the training process is so slow?

this is my spec:
17:21:03-241521 INFO Kohya_ss GUI version: v23.0.15
17:21:03-995002 INFO Submodule initialized and updated.
17:21:04-002004 INFO nVidia toolkit detected
17:21:17-808981 INFO Torch 2.1.2+cu118
17:21:18-361982 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8700
17:21:18-366997 INFO Torch detected GPU: NVIDIA GeForce RTX 3060 Laptop GPU VRAM 6144 Arch (8, 6) Cores 30
17:21:18-372982 INFO Python version is 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]

                INFO     caching latents.                                                          train_util.py:915
                INFO     checking cache validity...                                                train_util.py:925

100%|█████████████████████████████████████████████████████████████████████████████████| 40/40 [00:00<00:00, 890.43it/s]
INFO caching latents... train_util.py:962
0it [00:00, ?it/s]
prepare optimizer, data loader etc.
2024-03-31 18:08:36 INFO use 8-bit AdamW optimizer | {} train_util.py:3621
running training / 学習開始
num train images * repeats / 学習画像の数×繰り返し回数: 1600
num reg images / 正則化画像の数: 0
num batches per epoch / 1epochのバッチ数: 1600
num epochs / epoch数: 10
batch size per device / バッチサイズ: 1
total train batch size (with parallel & distributed & accumulation) / 総バッチサイズ(並列学習、勾配合計含む): 1
gradient ccumulation steps / 勾配を合計するステップ数 = 1
total optimization steps / 学習ステップ数: 16000
steps: 0%| | 0/16000 [00:00<?, ?it/s]
epoch 1/10
steps: 0%| | 1/16000 [00:16<74:53:20, 16.85s/it, avr_loss=0.0428]

kohya-ss commented 7 months ago

I think 6GB of VRAM is borderline for this. A resolution of 768x768 is too large for SD1.5; 512x512 would be better and would also reduce VRAM usage.
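As a rough illustration of the resolution point, the sketch below counts latent positions per image, assuming the usual 8x spatial downsampling of the SD1.5 VAE. The helper name latent_positions is hypothetical, and the count is only a proxy: real VRAM usage also depends on attention, gradient checkpointing, precision and the optimizer, so it will not scale exactly with this number.

```python
def latent_positions(width, height, vae_factor=8):
    """Number of spatial positions in the latent for one image.

    Assumes an 8x downsampling VAE (the SD1.5 default); this is a
    proxy for activation size, not a VRAM measurement.
    """
    return (width // vae_factor) * (height // vae_factor)


base = latent_positions(512, 512)
for side in (512, 768, 1024):
    n = latent_positions(side, side)
    print(f"{side}x{side}: {n} latent positions ({n / base:.2f}x of 512x512)")
```

Under that assumption, 768x768 carries 2.25 times as many latent positions as 512x512, and 1024x1024 carries 4 times as many, which is why dropping the resolution is the first thing to try on a 6GB card.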

rockerBOO commented 7 months ago

steps: 0%| | 0/16000 [00:00<?, ?it/s] epoch 1/10 steps: 0%| | 1/16000 [00:16<74:53:20, 16.85s/it, avr_loss=0.0428]

16s/it is a hint that it might be using shared GPU memory, which is backed by your system memory and is significantly slower. Something closer to 1-3s/it is more likely what you should be getting, so lower the VRAM usage by lowering the resolution as kohya-ss has suggested. You can check shared GPU memory usage in the Task Manager.
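To put the step rate in perspective, here is a small sketch that recomputes the step count and the estimated run time from the numbers in the log (1600 images x repeats, 10 epochs, batch size 1, 16.85s/it). The function names are hypothetical and the rounding is an assumption; sd-scripts may count steps slightly differently.

```python
import math


def optimization_steps(images_times_repeats, epochs, batch_size=1, grad_accum=1):
    """Approximate total optimizer steps for a run (rounding is assumed)."""
    batches_per_epoch = math.ceil(images_times_repeats / batch_size)
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return steps_per_epoch * epochs


def eta_hours(steps, seconds_per_step):
    """Wall-clock estimate in hours at a constant step rate."""
    return steps * seconds_per_step / 3600


steps = optimization_steps(1600, epochs=10)  # values taken from the log
for rate in (16.85, 3.0, 1.0):
    print(f"{rate:5.2f} s/it -> {eta_hours(steps, rate):5.1f} hours for {steps} steps")
```

At 16.85s/it the 16000 steps come to roughly 75 hours, in line with the 74:53:20 shown by the progress bar, while 1-3s/it would bring the same run down to roughly 4.5 to 13.5 hours.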