Linaqruf / kohya-trainer

Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning
Apache License 2.0

Error #211

Closed gitaiuser20 closed 1 year ago

gitaiuser20 commented 1 year ago

I'm getting this error when I try to start training:

    Loading settings from /content/LoRA/config/config_file.toml...
    /content/LoRA/config/config_file
    prepare tokenizer
    update token length: 225
    Train with captions.
    loading existing metadata: /content/LoRA/meta_lat.json
    metadata has bucket info, enable bucketing
    using bucket info in metadata
    [Dataset 0]
      batch_size: 7
      resolution: (512, 512)
      enable_bucket: True
      min_bucket_reso: None
      max_bucket_reso: None
      bucket_reso_steps: None
      bucket_no_upscale: None

    [Subset 0 of Dataset 0]
      image_dir: "/content/LoRA/train_data"
      image_count: 24
      num_repeats: 10
      shuffle_caption: True
      keep_tokens: 0
      caption_dropout_rate: 0
      caption_dropout_every_n_epoches: 0
      caption_tag_dropout_rate: 0
      color_aug: False
      flip_aug: False
      face_crop_aug_range: None
      random_crop: False
      token_warmup_min: 1
      token_warmup_step: 0
      metadata_file: /content/LoRA/meta_lat.json

    [Dataset 0]
    loading image sizes.
    100% 24/24 [00:00<00:00, 351969.57it/s]
    make buckets
    number of images (including repeats)
    bucket 0: resolution (384, 640), count: 40
    bucket 1: resolution (448, 576), count: 80
    bucket 2: resolution (512, 512), count: 20
    bucket 3: resolution (576, 448), count: 40
    bucket 4: resolution (640, 384), count: 40
    bucket 5: resolution (704, 320), count: 20
    mean ar error (without repeats): 0.0
    prepare accelerator

    Traceback (most recent call last):
      File "/content/kohya-trainer/train_network.py", line 773, in <module>
        train(args)
      File "/content/kohya-trainer/train_network.py", line 140, in train
        accelerator, unwrap_model = train_util.prepare_accelerator(args)
      File "/content/kohya-trainer/library/train_util.py", line 2768, in prepare_accelerator
        log_with=log_with,
    UnboundLocalError: local variable 'log_with' referenced before assignment

    Traceback (most recent call last):
      File "/usr/local/bin/accelerate", line 8, in <module>
        sys.exit(main())
      File "/usr/local/lib/python3.9/dist-packages/accelerate/commands/accelerate_cli.py", line 45, in main
        args.func(args)
      File "/usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py", line 1104, in launch_command
        simple_launcher(args)
      File "/usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py", line 567, in simple_launcher
        raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
    subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_network.py', '--sample_prompts=/content/LoRA/config/sample_prompt.txt', '--config_file=/content/LoRA/config/config_file.toml']' returned non-zero exit status 1.
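The UnboundLocalError here is a standard Python failure mode: a local variable that is only assigned inside a conditional branch, then read on a code path that skipped the branch. The sketch below is a hypothetical minimal reproduction, not the actual code of `prepare_accelerator` in library/train_util.py; the function names and the fix are illustrative assumptions.

```python
# Hypothetical reduction of the failure pattern behind the traceback above.
# `log_with` is assigned only inside the `if` branch, so any call that skips
# the branch raises UnboundLocalError when the name is read afterwards.
def prepare_accelerator(logging_dir=None):
    if logging_dir is not None:
        log_with = "tensorboard"  # assigned only on this branch
    return log_with  # fails with UnboundLocalError when logging_dir is None


# A defensive fix: give the variable a default before the branch so every
# code path defines the name.
def prepare_accelerator_fixed(logging_dir=None):
    log_with = None
    if logging_dir is not None:
        log_with = "tensorboard"
    return log_with


if __name__ == "__main__":
    try:
        prepare_accelerator(logging_dir=None)
    except UnboundLocalError as e:
        print(f"UnboundLocalError: {e}")
    print(prepare_accelerator_fixed(logging_dir=None))
```

In this issue the branch was skipped for a different upstream reason (bad inputs derailed setup), but the crash surfaces at the first read of the unassigned name.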

Linaqruf commented 1 year ago

which notebook is this?

gitaiuser20 commented 1 year ago

My bad, it turns out I had a couple of WebP files I forgot to clean up. It was the LoRA fine-tuning notebook.