bmaltais / kohya_ss

Error when creating a lora #2418

Closed: clownmoney closed this issue 4 months ago

clownmoney commented 4 months ago

This error occurs every time I start training:


  num train images * repeats / 学習画像の数×繰り返し回数: 17
  num reg images / 正則化画像の数: 0
  num batches per epoch / 1epochのバッチ数: 8
  num epochs / epoch数: 50
  batch size per device / バッチサイズ: 3
  gradient accumulation steps / 勾配を合計するステップ数 = 1
  total optimization steps / 学習ステップ数: 400
steps:   0%|                                                                                   | 0/400 [00:00<?, ?it/s]Traceback (most recent call last):
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\sd-scripts\sdxl_train_network.py", line 185, in <module>
    trainer.train(args)
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\sd-scripts\train_network.py", line 755, in train
    accelerator.init_trackers(
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 619, in _inner
    return PartialState().on_main_process(function)(*args, **kwargs)
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 2331, in init_trackers
    tracker_init(project_name, self.logging_dir, **init_kwargs.get(str(tracker), {}))
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\accelerate\tracking.py", line 79, in execute_on_main_process
    return PartialState().on_main_process(function)(self, *args, **kwargs)
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\accelerate\tracking.py", line 184, in __init__
    from torch.utils import tensorboard
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\torch\utils\tensorboard\__init__.py", line 12, in <module>
    from .writer import FileWriter, SummaryWriter  # noqa: F401
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\torch\utils\tensorboard\writer.py", line 16, in <module>
    from ._embedding import get_embedding_info, make_mat, make_sprite, make_tsv, write_pbtxt
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\torch\utils\tensorboard\_embedding.py", line 9, in <module>
    _HAS_GFILE_JOIN = hasattr(tf.io.gfile, "join")
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\tensorboard\lazy.py", line 65, in __getattr__
    return getattr(load_once(self), attr_name)
AttributeError: module 'tensorflow' has no attribute 'io'
steps:   0%|                                                                                   | 0/400 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
    args.func(args)
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "C:\Users\user\OneDrive\Documents\ai\train\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\\Users\\user\\OneDrive\\Documents\\ai\\train\\kohya_ss\\venv\\Scripts\\python.exe', 'C:/Users/user/OneDrive/Documents/ai/train/kohya_ss/sd-scripts/sdxl_train_network.py', '--config_file', './outputs/config_lora-20240429-132009.toml', '--optimizer_args', 'weight_decay=0.01', 'd_coef=1', 'use_bias_correction=True', 'safeguard_warmup=False', 'betas=0.9,0.99']' returned non-zero exit status 1.
13:20:46-700152 INFO     Training has ended.
clownmoney commented 4 months ago

https://github.com/bmaltais/kohya_ss/issues/904#issuecomment-1575676026
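
For anyone landing here with the same traceback: the last frame fails on `tf.io.gfile`, which suggests that a `tensorflow` module is importable in the venv but lacks the `io` attribute. That reading is an assumption drawn from the traceback alone; the thread does not confirm the cause. Below is a minimal diagnostic sketch to check it. The helper name `diagnose_module` is hypothetical and is not part of kohya_ss, accelerate, or torch.

```python
import importlib
import importlib.util


def diagnose_module(name: str, attr: str) -> str:
    """Classify how `name` resolves in the current interpreter.

    Returns one of: "missing", "import_error", "broken", "ok".
    Hypothetical helper for illustration only.
    """
    try:
        # find_spec returns None when no top-level module of that name exists.
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return "import_error"
    if spec is None:
        return "missing"
    try:
        module = importlib.import_module(name)
    except Exception:
        # The package is present but fails while importing.
        return "import_error"
    # "broken" means the module imports but lacks the expected attribute,
    # which is the situation the AttributeError in the traceback describes.
    return "ok" if hasattr(module, attr) else "broken"
```

Run it with the venv's `python.exe` as `print(diagnose_module("tensorflow", "io"))`. A result of `"broken"` would be consistent with the `AttributeError` above, while `"missing"` or `"ok"` would point to a different cause.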