Linaqruf / kohya-trainer

Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning
Apache License 2.0
1.83k stars, 300 forks

LoRA-trainer-XL, error when training. #290

Open KarelAI opened 11 months ago

KarelAI commented 11 months ago

Loading settings from /content/LoRA/config/config_file.toml...
/content/LoRA/config/config_file
prepare tokenizers
Downloading (…)olve/main/vocab.json: 100% 961k/961k [00:00<00:00, 3.71MB/s]
Downloading (…)olve/main/merges.txt: 100% 525k/525k [00:00<00:00, 3.95MB/s]
Downloading (…)cial_tokens_map.json: 100% 389/389 [00:00<00:00, 2.25MB/s]
Downloading (…)okenizer_config.json: 100% 905/905 [00:00<00:00, 6.13MB/s]
Downloading (…)olve/main/vocab.json: 100% 862k/862k [00:00<00:00, 4.26MB/s]
Downloading (…)olve/main/merges.txt: 100% 525k/525k [00:00<00:00, 3.90MB/s]
Downloading (…)cial_tokens_map.json: 100% 389/389 [00:00<00:00, 2.35MB/s]
Downloading (…)okenizer_config.json: 100% 904/904 [00:00<00:00, 6.17MB/s]
update token length: 225
Training with captions.

Traceback (most recent call last)

/content/kohya-trainer/sdxl_train_network.py:174 in <module>

  171     args = train_util.read_config_from_file(args, parser)
  172
  173     trainer = SdxlNetworkTrainer()
❱ 174     trainer.train(args)
  175

/content/kohya-trainer/train_network.py:177 in train

  174                     }
  175
  176             blueprint = blueprint_generator.generate(user_config, args
❱ 177             train_dataset_group = config_util.generate_dataset_group_b
  178         else:
  179             # use arbitrary dataset class
  180             train_dataset_group = train_util.load_arbitrary_dataset(ar

/content/kohya-trainer/library/config_util.py:426 in generate_dataset_group_by_blueprint

  423     dataset_klass = FineTuningDataset
  424
  425     subsets = [subset_klass(**asdict(subset_blueprint.params)) for sub
❱ 426     dataset = dataset_klass(subsets=subsets, **asdict(dataset_blueprin
  427     datasets.append(dataset)
  428
  429 # print info

/content/kohya-trainer/library/train_util.py:1477 in __init__

  1474                 with open(subset.metadata_file, "rt", encoding="utf-8
  1475                     metadata = json.load(f)
  1476             else:
❱ 1477                 raise ValueError(f"no metadata / メタデータファイルが
  1478
  1479             if len(metadata) < 1:
  1480                 print(f"ignore subset with '{subset.metadata_file}':

ValueError: no metadata / メタデータファイルがありません: /content/LoRA/meta_lat.json

Traceback (most recent call last)

/usr/local/bin/accelerate:8 in <module>

  5 from accelerate.commands.accelerate_cli import main
  6 if __name__ == '__main__':
  7     sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
❱ 8     sys.exit(main())
  9

/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py:45 in main

  42         exit(1)
  43
  44     # Run
❱ 45     args.func(args)
  46
  47
  48 if __name__ == "__main__":

/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py:918 in launch_command

  915     elif defaults is not None and defaults.compute_environment == Comp
  916         sagemaker_launcher(defaults, args)
  917     else:
❱ 918         simple_launcher(args)
  919
  920
  921 def main():

/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py:580 in simple_launcher

  577     process.wait()
  578     if process.returncode != 0:
  579         if not args.quiet:
❱ 580             raise subprocess.CalledProcessError(returncode=process.ret
  581         else:
  582             sys.exit(1)
  583

CalledProcessError: Command '['/usr/bin/python3', 'sdxl_train_network.py', '--sample_prompts=/content/LoRA/config/sample_prompt.toml', '--config_file=/content/LoRA/config/config_file.toml', '--wandb_api_key=aac18f50f617dfc5da70cd13eccaf58a53595754']' returned non-zero exit status 1.
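
The first traceback ends in `ValueError: no metadata`, which suggests the trainer could not find the metadata file at `/content/LoRA/meta_lat.json`. The second traceback appears to be only `accelerate` reporting that the child process exited with status 1. The visible lines in `train_util.py` suggest that a subset whose metadata loads but is empty gets ignored. A minimal pre-flight check, run before starting training, might look like the sketch below. The function name `check_metadata_file` and its return values are hypothetical and not part of kohya-trainer. It only mirrors the logic visible in the traceback, and it assumes the file is supposed to be produced by an earlier preprocessing step.

```python
import json
from pathlib import Path


def check_metadata_file(metadata_file: str) -> str:
    """Hypothetical helper (not part of kohya-trainer).

    Returns "missing" if the file does not exist, "empty" if it parses
    but contains no entries, and "ok" otherwise.
    """
    path = Path(metadata_file)
    if not path.is_file():
        # This is the situation that triggers "ValueError: no metadata".
        return "missing"
    # Same open mode and encoding as shown in the traceback.
    with open(path, "rt", encoding="utf-8") as f:
        metadata = json.load(f)
    if len(metadata) < 1:
        # The traceback suggests the trainer ignores such a subset.
        return "empty"
    return "ok"


if __name__ == "__main__":
    # Path taken from the error message in this issue.
    print(check_metadata_file("/content/LoRA/meta_lat.json"))
```

If this prints `missing`, the likely next step is to confirm that the preprocessing step that writes the metadata file finished, and that the path in the config file matches where it was written.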

Copynoa commented 10 months ago

Have you solved it yet?