bmaltais / kohya_ss

Apache License 2.0

no matter what I try, my training model just outputs a .json file. ... #2038

Closed FLIPACADABRA closed 7 months ago

FLIPACADABRA commented 7 months ago

I've tried switching the optimizer from Adam 8-bit to Adafactor, but it still didn't work. Halllp!


10:19:08-855002 INFO Version: v22.6.2
10:19:08-862468 INFO nVidia toolkit detected
10:19:10-882883 INFO Torch 2.1.2+cu118
10:19:10-901592 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8907
10:19:10-905293 INFO Torch detected GPU: Quadro RTX 4000 VRAM 8192 Arch (7, 5) Cores 40
10:19:10-906786 INFO Verifying modules installation status from requirements_windows_torch2.txt...
10:19:10-909773 INFO Installing package: torch==2.1.2+cu118 torchvision==0.16.2+cu118 torchaudio==2.1.2+cu118 --index-url https://download.pytorch.org/whl/cu118
10:19:14-645174 INFO Installing package: xformers==0.0.23.post1+cu118 --index-url https://download.pytorch.org/whl/cu118
10:19:17-218043 INFO Verifying modules installation status from requirements.txt...
10:19:22-378795 INFO headless: False
10:19:22-383309 INFO Load CSS...
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
10:23:58-602779 INFO Start training Dreambooth...
10:23:58-604272 INFO Valid image folder names found in: C:/Users/18183/Desktop/Goose/images
10:23:58-605766 INFO Valid image folder names found in: C:/Users/18183/Desktop/Goose/regularization
10:23:58-608005 INFO Folder 30_cat : steps 1020
10:23:58-609499 INFO Regularisation images are used... Will double the number of steps required...
10:23:58-610246 INFO max_train_steps (1020 / 1 / 1 * 7 * 2) = 14280
10:23:58-612485 INFO stop_text_encoder_training = 0
10:23:58-614725 INFO lr_warmup_steps = 1428
10:23:58-615472 INFO Saving training config to C:/Users/18183/Desktop/Goose/model\MAGIC_20240308-102358.json...
10:23:58-617712 INFO accelerate launch --num_cpu_threads_per_process=2 "./train_db.py" --bucket_no_upscale --bucket_reso_steps=64 --cache_latents --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --learning_rate="1e-05" --learning_rate_te="1e-05" --logging_dir="C:/Users/18183/Desktop/Goose/log" --lr_scheduler="cosine" --lr_scheduler_num_cycles="7" --lr_warmup_steps="1428" --max_data_loader_n_workers="0" --resolution="512,512" --max_train_steps="14280" --mixed_precision="fp16" --optimizer_type="Adafactor" --output_dir="C:/Users/18183/Desktop/Goose/model" --output_name="MAGIC" --pretrained_model_name_or_path="C:/Users/18183/Desktop/SD-Forge/webui_forge_cu121_torch21/webui/models/Stable-diffusion/v1-5-pruned" --reg_data_dir="C:/Users/18183/Desktop/Goose/regularization" --save_every_n_epochs="1" --save_model_as=safetensors --save_precision="fp16" --train_batch_size="1" --train_data_dir="C:/Users/18183/Desktop/Goose/images" --xformers
A matching Triton is not available, some optimizations will not be enabled. Error caught was: No module named 'triton'
2024-03-08 10:24:09 INFO prepare tokenizer train_util.py:3959
2024-03-08 10:24:10 INFO prepare images. train_util.py:1469
INFO found directory C:\Users\18183\Desktop\Goose\images\30_cat contains 34 image files train_util.py:1432
WARNING No caption file found for 34 images. Training will continue without captions for these images. If class token exists, it will be used. train_util.py:1459
WARNING C:\Users\18183\Desktop\Goose\images\30_cat\image (10).JPG train_util.py:1466
WARNING C:\Users\18183\Desktop\Goose\images\30_cat\image (11).JPG train_util.py:1466
WARNING C:\Users\18183\Desktop\Goose\images\30_cat\image (12).JPG train_util.py:1466
WARNING C:\Users\18183\Desktop\Goose\images\30_cat\image (13).JPG train_util.py:1466
WARNING C:\Users\18183\Desktop\Goose\images\30_cat\image (14).JPG train_util.py:1466
WARNING C:\Users\18183\Desktop\Goose\images\30_cat\image (15).JPG... and 29 more train_util.py:1464
INFO found directory C:\Users\18183\Desktop\Goose\regularization\1_cat contains 2332 image files train_util.py:1432
WARNING No caption file found for 2332 images. Training will continue without captions for these images. If class token exists, it will be used. train_util.py:1459
WARNING C:\Users\18183\Desktop\Goose\regularization\1_cat\reg (10).png train_util.py:1466
WARNING C:\Users\18183\Desktop\Goose\regularization\1_cat\reg (100).png train_util.py:1466
WARNING C:\Users\18183\Desktop\Goose\regularization\1_cat\reg (1000).png train_util.py:1466
WARNING C:\Users\18183\Desktop\Goose\regularization\1_cat\reg (1001).png train_util.py:1466
WARNING C:\Users\18183\Desktop\Goose\regularization\1_cat\reg (1002).png train_util.py:1466
WARNING C:\Users\18183\Desktop\Goose\regularization\1_cat\reg (1003).png... and 2327 more train_util.py:1464
INFO 1020 train images with repeating. train_util.py:1508
INFO 2332 reg images. train_util.py:1511
WARNING some of reg images are not used train_util.py:1513
INFO [Dataset 0] config_util.py:544
                           batch_size: 1
                           resolution: (512, 512)
                           enable_bucket: True
                           network_multiplier: 1.0
                           min_bucket_reso: 256
                           max_bucket_reso: 2048
                           bucket_reso_steps: 64
                           bucket_no_upscale: True

                           [Subset 0 of Dataset 0]
                             image_dir: "C:\Users\18183\Desktop\Goose\images\30_cat"
                             image_count: 34
                             num_repeats: 30
                             shuffle_caption: False
                             keep_tokens: 0
                             keep_tokens_separator:
                             caption_dropout_rate: 0.0
                             caption_dropout_every_n_epoches: 0
                             caption_tag_dropout_rate: 0.0
                             caption_prefix: None
                             caption_suffix: None
                             color_aug: False
                             flip_aug: False
                             face_crop_aug_range: None
                             random_crop: False
                             token_warmup_min: 1,
                             token_warmup_step: 0,
                             is_reg: False
                             class_tokens: cat
                             caption_extension: .caption

                           [Subset 1 of Dataset 0]
                             image_dir: "C:\Users\18183\Desktop\Goose\regularization\1_cat"
                             image_count: 2332
                             num_repeats: 1
                             shuffle_caption: False
                             keep_tokens: 0
                             keep_tokens_separator:
                             caption_dropout_rate: 0.0
                             caption_dropout_every_n_epoches: 0
                             caption_tag_dropout_rate: 0.0
                             caption_prefix: None
                             caption_suffix: None
                             color_aug: False
                             flip_aug: False
                             face_crop_aug_range: None
                             random_crop: False
                             token_warmup_min: 1,
                             token_warmup_step: 0,
                             is_reg: True
                             class_tokens: cat
                             caption_extension: .caption

                INFO     [Dataset 0]  config_util.py:550
                INFO     loading image sizes.  train_util.py:794
                100%|████████████████████████████████████████████████████████████████████████████| 1054/1054 [00:00<00:00, 6952.27it/s]
                INFO     make buckets  train_util.py:800
                WARNING  min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically  train_util.py:817
                INFO     number of images (including repeats)  train_util.py:846
                INFO     bucket 0: resolution (256, 512), count: 5  train_util.py:851
                INFO     bucket 1: resolution (384, 512), count: 30  train_util.py:851
                INFO     bucket 2: resolution (512, 512), count: 2005  train_util.py:851
                INFO     mean ar error (without repeats): 0.00039577909183178146  train_util.py:856
                INFO     prepare accelerator  train_db.py:101
                accelerator device: cuda
                INFO     loading model for process 0/1  train_util.py:4111
                INFO     load Diffusers pretrained models: C:/Users/18183/Desktop/SD-Forge/webui_forge_cu121_torch21/webui/models/Stable-diffusion/v1-5-pruned  train_util.py:4072

Traceback (most recent call last):
  File "C:\Users\18183\Desktop\SD-Forge\kohya_ss\train_db.py", line 504, in <module>
    train(args)
  File "C:\Users\18183\Desktop\SD-Forge\kohya_ss\train_db.py", line 118, in train
    text_encoder, vae, unet, load_stable_diffusion_format = train_util.load_target_model(args, weight_dtype, accelerator)
  File "C:\Users\18183\Desktop\SD-Forge\kohya_ss\library\train_util.py", line 4113, in load_target_model
    text_encoder, vae, unet, load_stable_diffusion_format = _load_target_model(
  File "C:\Users\18183\Desktop\SD-Forge\kohya_ss\library\train_util.py", line 4074, in _load_target_model
    pipe = StableDiffusionPipeline.from_pretrained(name_or_path, tokenizer=None, safety_checker=None)
  File "C:\Users\18183\Desktop\SD-Forge\kohya_ss\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\Users\18183\Desktop\SD-Forge\kohya_ss\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 1092, in from_pretrained
    raise ValueError(
ValueError: The provided pretrained_model_name_or_path "C:/Users/18183/Desktop/SD-Forge/webui_forge_cu121_torch21/webui/models/Stable-diffusion/v1-5-pruned" is neither a valid local path nor a valid repo id. Please check the parameter.
Traceback (most recent call last):
  File "C:\Users\18183\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\18183\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\18183\Desktop\SD-Forge\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "C:\Users\18183\Desktop\SD-Forge\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
    args.func(args)
  File "C:\Users\18183\Desktop\SD-Forge\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "C:\Users\18183\Desktop\SD-Forge\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Users\18183\Desktop\SD-Forge\kohya_ss\venv\Scripts\python.exe', './train_db.py', '--bucket_no_upscale', '--bucket_reso_steps=64', '--cache_latents', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--learning_rate=1e-05', '--learning_rate_te=1e-05', '--logging_dir=C:/Users/18183/Desktop/Goose/log', '--lr_scheduler=cosine', '--lr_scheduler_num_cycles=7', '--lr_warmup_steps=1428', '--max_data_loader_n_workers=0', '--resolution=512,512', '--max_train_steps=14280', '--mixed_precision=fp16', '--optimizer_type=Adafactor', '--output_dir=C:/Users/18183/Desktop/Goose/model', '--output_name=MAGIC', '--pretrained_model_name_or_path=C:/Users/18183/Desktop/SD-Forge/webui_forge_cu121_torch21/webui/models/Stable-diffusion/v1-5-pruned', '--reg_data_dir=C:/Users/18183/Desktop/Goose/regularization', '--save_every_n_epochs=1', '--save_model_as=safetensors', '--save_precision=fp16', '--train_batch_size=1', '--train_data_dir=C:/Users/18183/Desktop/Goose/images', '--xformers']' returned non-zero exit status 1.
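
For reference, the step counts reported near the top of this log can be reproduced with a few lines of Python. This is only a sketch: the log's `max_train_steps` expression reads most naturally as 1020 / 1 / 1 * 7 * 2, and labelling those factors as batch size, gradient accumulation steps, epochs and a regularisation factor is an assumption, as is the 10% warmup (inferred only from 1428 being a tenth of 14280).

```python
import math

# Values taken from the log above.
image_count = 34       # images found in the 30_cat folder
num_repeats = 30       # repeat count from the "30_" folder prefix
train_batch_size = 1   # --train_batch_size="1"

# Assumed meaning of the remaining factors in the max_train_steps expression.
gradient_accumulation_steps = 1  # assumption
epochs = 7                       # assumption
reg_factor = 2                   # the log says regularisation images double the steps
warmup_percent = 10              # assumption: 1428 is 10% of 14280

steps_per_epoch = image_count * num_repeats
max_train_steps = math.ceil(
    steps_per_epoch / train_batch_size / gradient_accumulation_steps * epochs * reg_factor
)
lr_warmup_steps = round(warmup_percent * max_train_steps / 100)

print(steps_per_epoch, max_train_steps, lr_warmup_steps)
```

With these inputs the sketch gives 1020, 14280 and 1428, matching the values in the log.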

bmaltais commented 7 months ago

Looks like the path to the model you are trying to use is missing the .safetensors extension:

c:/Users/18183/Desktop/SD-Forge/webui_forge_cu121_torch21/webui/models/Stable-diffusion/v1-5-pruned
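
A quick way to catch this before launching training is to check that the model path exists on disk. The sketch below is illustrative only: `resolve_model_path` is a hypothetical helper, not part of kohya_ss, and the list of extensions it tries is an assumption about how the checkpoint file is named.

```python
from pathlib import Path

# Extensions to try when the given path does not exist (assumed list).
CHECKPOINT_EXTENSIONS = (".safetensors", ".ckpt")


def resolve_model_path(path_str):
    """Return an existing model path, or None if nothing matching is found.

    Hypothetical helper for illustration; not part of kohya_ss.
    """
    path = Path(path_str)
    if path.exists():
        return path
    # Append the extension to the file name rather than using with_suffix(),
    # which would replace anything after the last dot in the name.
    for ext in CHECKPOINT_EXTENSIONS:
        candidate = path.with_name(path.name + ext)
        if candidate.is_file():
            return candidate
    return None
```

Calling `resolve_model_path` on the path above would return the `.safetensors` file if it sits next to that name, and `None` otherwise, in which case the path in the GUI needs correcting by hand.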

FLIPACADABRA commented 7 months ago

Thank you! A couple of other things were wrong too, but I appreciate the help!