Open FurkanGozukara opened 3 months ago
@bmaltais
Here is another error. It happens randomly; after several attempts, training somehow starts working.
16:09:07-025674 INFO Loading config...
16:09:15-470598 INFO Save...
16:10:33-394977 INFO Save...
16:10:36-458435 INFO Start training Dreambooth...
16:10:36-462434 INFO Validating lr scheduler arguments...
16:10:36-463433 INFO Validating optimizer arguments...
16:10:36-463433 INFO Validating C:/test_kohya/DreamBooth existence and writability... SUCCESS
16:10:36-464433 INFO Validating C:/ComfyUI_windows_portable/ComfyUI/models/checkpoints/sd_xl_base_1.0.safetensors
existence... SUCCESS
16:10:36-466433 INFO Validating C:/Users/RENDA/Pictures/31maymodel\img existence... SUCCESS
16:10:36-466433 INFO Folder 1_ohwx style: 1 repeats found
16:10:36-467433 INFO Folder 1_ohwx style: 51 images found
16:10:36-468432 INFO Folder 1_ohwx style: 51 * 1 = 51 steps
16:10:36-468432 INFO Regulatization factor: 1
16:10:36-469433 INFO Total steps: 51
16:10:36-469433 INFO Train batch size: 1
16:10:36-470433 INFO Gradient accumulation steps: 1
16:10:36-470433 INFO Epoch: 150
16:10:36-471432 INFO max_train_steps (51 / 1 / 1 * 150 * 1) = 7650
16:10:36-471432 INFO lr_warmup_steps = 0
16:10:36-473433 INFO Saving training config to C:/test_kohya/DreamBooth\last_20240531-161036.json...
16:10:36-475434 INFO Executing command: C:\kohya_new\kohya_ss\venv\Scripts\accelerate.EXE launch --dynamo_backend no
--dynamo_mode default --mixed_precision bf16 --num_processes 1 --num_machines 1
--num_cpu_threads_per_process 2 C:/kohya_new/kohya_ss/sd-scripts/sdxl_train.py --config_file
C:/test_kohya/DreamBooth/config_dreambooth-20240531-161036.toml
16:10:36-478432 INFO Command executed.
2024-05-31 16:10:42 INFO Loading settings from train_util.py:3744
C:/test_kohya/DreamBooth/config_dreambooth-20240531-161036.toml...
INFO C:/test_kohya/DreamBooth/config_dreambooth-20240531-161036 train_util.py:3763
WARNING clip_skip will be unexpected / SDXL学習ではclip_skipは動作しません sdxl_train_util.py:343
2024-05-31 16:10:42 INFO prepare tokenizers sdxl_train_util.py:134
INFO update token length: 75 sdxl_train_util.py:159
INFO Using DreamBooth method. sdxl_train.py:144
Traceback (most recent call last):
File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\kohya_new\kohya_ss\venv\Scripts\accelerate.EXE\__main__.py", line 7, in <module>
File "C:\kohya_new\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
args.func(args)
File "C:\kohya_new\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
simple_launcher(args)
File "C:\kohya_new\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\\kohya_new\\kohya_ss\\venv\\Scripts\\python.exe', 'C:/kohya_new/kohya_ss/sd-scripts/sdxl_train.py', '--config_file', 'C:/test_kohya/DreamBooth/config_dreambooth-20240531-161036.toml']' returned non-zero exit status 3221225477.
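A note on reading the exit status above: 3221225477 is 0xC0000005 in hexadecimal, which on Windows is the NTSTATUS code for an access violation (a native crash in the child process, not a Python exception). The sketch below only decodes such a number; the helper name `describe_exit_status` and the small lookup table are illustrative assumptions, not part of kohya_ss or accelerate, and the table lists just a few common codes.

```python
# Hypothetical helper (not part of kohya_ss or accelerate): decode a
# Windows process exit status into its hex form and, if known, its NTSTATUS name.

# A few well-known NTSTATUS values; this table is deliberately incomplete.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC000013A: "STATUS_CONTROL_C_EXIT",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def describe_exit_status(status: int) -> str:
    """Return the exit status as 32-bit hex plus a name when it is in the table."""
    # Some tools report the same value as a signed 32-bit integer,
    # so normalize to the unsigned representation first.
    code = status & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(code, "unknown")
    return f"0x{code:08X} ({name})"


if __name__ == "__main__":
    # The value reported by subprocess.CalledProcessError in the log above.
    print(describe_exit_status(3221225477))
```

An access violation usually points at native code (for example a driver, a CUDA library, or a compiled extension) rather than at the training script itself, which would be consistent with the failure appearing at random.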
I am getting the exact same error. Did you find a solution?
It runs for many epochs but then randomly fails, as shown below. Any ideas?
Windows 11, Python 3.10.11, fresh install.
I think this is related to the newest process-calling system. The failed training run below saved 2 checkpoints and trained for 39 epochs before randomly failing. Random failures happen frequently.