bmaltais / kohya_ss

Apache License 2.0

both option use_8bit_adam and optimizer_type are specified - 8bit Adam bug #229

Closed mykeehu closed 1 year ago

mykeehu commented 1 year ago

I started the latest version with the usual parameters, but now I get an error. It seems there is something wrong with the optimizer? I tried both AdamW and AdamW8bit, without success.

If I turn off the "Use 8bit adam" option and select AdamW, it starts. So either this option should be disabled, or it should be ignored when building the command if AdamW is selected.

What is the difference between AdamW and AdamW8bit? If I choose the former, will it cause a burn-in (as if I had overtrained it)?

Replace CrossAttention.forward to use xformers
caching latents.
100%|██████████████████████████████████████████| 82/82 [00:10<00:00,  7.53it/s]
import network module: networks.lora
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.
Traceback (most recent call last):
  File "H:\Kohya-DB\kohya_ss\train_network.py", line 507, in <module>
    train(args)
  File "H:\Kohya-DB\kohya_ss\train_network.py", line 150, in train
    optimizer_name, optimizer_args, optimizer = train_util.get_optimizer(args, trainable_params)
  File "H:\Kohya-DB\kohya_ss\library\train_util.py", line 1536, in get_optimizer
    assert optimizer_type is None or optimizer_type == "", "both option use_8bit_adam and optimizer_type are specified / use_8bit_adamとoptimizer_typeの両方のオプションが指定されています"
AssertionError: both option use_8bit_adam and optimizer_type are specified / use_8bit_adamとoptimizer_typeの両方のオプションが指定されています
Traceback (most recent call last):
  File "C:\Users\Mykee\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Mykee\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "H:\Kohya-DB\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "H:\Kohya-DB\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "H:\Kohya-DB\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "H:\Kohya-DB\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['H:\\Kohya-DB\\kohya_ss\\venv\\Scripts\\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=H:/Stable-Diffusion-Automatic/Dreambooth/Kohyatrain/ultimate downblouse/img-nodb-nositting-aesthetic-min', '--resolution=512,512', '--output_dir=H:/Stable-Diffusion-Automatic/Dreambooth/Kohyatrain/ultimate downblouse/model-nodb-nositting-aesthetic-min', '--logging_dir=H:/Stable-Diffusion-Automatic/Dreambooth/Kohyatrain/ultimate downblouse/log', '--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=128', '--output_name=ultimate-downblouse-10-aest', '--lr_scheduler_num_cycles=5', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=1', '--max_train_steps=8000', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=1', '--bucket_reso_steps=64', '--shuffle_caption', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.
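
For context: the launch command above passes both --use_8bit_adam and --optimizer_type=AdamW8bit, and the trainer refuses that combination. A minimal sketch of what the assertion in train_util.get_optimizer amounts to (simplified for illustration; not the actual implementation):

def resolve_optimizer_type(args):
    # Sketch only: if the legacy flag is set, no explicit optimizer_type may also be given.
    if args.use_8bit_adam:
        assert not args.optimizer_type, \
            "both option use_8bit_adam and optimizer_type are specified"
        return "AdamW8bit"  # the legacy flag presumably maps to the 8bit optimizer
    return args.optimizer_type or "AdamW"

Dropping --use_8bit_adam from the command (or unchecking the corresponding checkbox in the GUI, as discussed further down) avoids the conflict.
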
NeoLoger commented 1 year ago

Same exact issue; I think it started after the latest update.

Replace CrossAttention.forward to use xformers
caching latents.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 498/498 [00:29<00:00, 17.12it/s]
import network module: networks.lora
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.
Traceback (most recent call last):
  File "S:\kohya\kohya_ss\train_network.py", line 507, in <module>
    train(args)
  File "S:\kohya\kohya_ss\train_network.py", line 150, in train
    optimizer_name, optimizer_args, optimizer = train_util.get_optimizer(args, trainable_params)
  File "S:\kohya\kohya_ss\library\train_util.py", line 1536, in get_optimizer
    assert optimizer_type is None or optimizer_type == "", "both option use_8bit_adam and optimizer_type are specified / use_8bit_adamとoptimizer_typeの両方のオプションが 指定されています"
AssertionError: both option use_8bit_adam and optimizer_type are specified / use_8bit_adamとoptimizer_typeの両方のオプションが指定されています
Traceback (most recent call last):
  File "C:\Users\Yvggeniy\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Yvggeniy\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "S:\kohya\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "S:\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "S:\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "S:\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['S:\\kohya\\kohya_ss\\venv\\Scripts\\python.exe', 'train_network.py', '--pretrained_model_name_or_path=S:/stable-diffusion/stable-diffusion-webui/models/Stable-diffusion/My_mixes/My_anyhentai_abyss_reva1_grape32_pov.safetensors', '--train_data_dir=V:/ImagesForSDTrining/Lora training/all the way through/image', '--resolution=512,512', '--output_dir=V:/ImagesForSDTrining/Lora training/all the way through/model', '--logging_dir=V:/ImagesForSDTrining/Lora training/all the way through/log', '--network_alpha=128', '--training_comment=Trained on My_anyhentai_abyss_reva1_grape32_pov', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=128', '--output_name=all the way through_v3', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=2', '--max_train_steps=24900', '--save_every_n_epochs=1', '--mixed_precision=bf16', '--save_precision=bf16', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.
Throvtek commented 1 year ago

Exact same error here. It's my first time using LoRA and I'm getting this error. Is there a way I can install an older version?

Rakshesha2024 commented 1 year ago

Try unchecking the "Use 8bit adam" checkbox in Advanced Configuration.

Throvtek commented 1 year ago

Try unchecking the "Use 8bit adam" checkbox in Advanced Configuration.

It's working now! Thank you very much, Guyray!!!! How could I have been so blind not to see that option? Thanks, man.

bmaltais commented 1 year ago

Yeah... it is a config change issue introduced by the latest kohya_ss trainer update... I am not sure how best to fix this... I thought of removing the old 8bit checkbox, but this would break older config files... Perhaps I could implement some logic to disable the 8bit checkbox when a user selects anything but the AdamW8bit option...
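
A rough sketch of that idea, assuming the GUI is a Gradio 3.x app with an optimizer dropdown and the legacy checkbox (component and variable names here are illustrative, not the ones actually used in the repo):

import gradio as gr

with gr.Blocks() as demo:
    optimizer = gr.Dropdown(
        ["AdamW", "AdamW8bit", "Lion", "DAdaptation"],
        value="AdamW8bit",
        label="Optimizer",
    )
    use_8bit_adam = gr.Checkbox(label="Use 8bit adam", value=False)

    def sync_8bit_checkbox(selected):
        # Grey out (and clear) the legacy checkbox unless AdamW8bit is selected.
        if selected == "AdamW8bit":
            return gr.update(interactive=True)
        return gr.update(value=False, interactive=False)

    optimizer.change(sync_8bit_checkbox, inputs=optimizer, outputs=use_8bit_adam)

demo.launch()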

jiuxiaojian commented 1 year ago

After unchecking the "Use 8bit adam" checkbox in Advanced Configuration, I found it still doesn't work. I want to know how to fix it.

loading text encoder: <All keys matched successfully>
Replace CrossAttention.forward to use xformers
caching latents.
100%|████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:03<00:00,  2.29it/s]
import network module: networks.lora
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.
use AdamW optimizer | {}
Traceback (most recent call last):
  File "D:\AI Drawing\Lora\kohya_ss\train_network.py", line 507, in <module>
    train(args)
  File "D:\AI Drawing\Lora\kohya_ss\train_network.py", line 176, in train
    unet, text_encoder, network, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 876, in prepare
    result = tuple(
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 877, in <genexpr>
    self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 741, in _prepare_one
    return self.prepare_model(obj, device_placement=device_placement)
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 912, in prepare_model
    model = model.to(self.device)
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\transformers\modeling_utils.py", line 1749, in to
    return super().to(*args, **kwargs)
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 927, in to
    return self._apply(convert)
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
    module._apply(fn)
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
    module._apply(fn)
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
    module._apply(fn)
  [Previous line repeated 3 more times]
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 602, in _apply
    param_applied = fn(param)
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 925, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 4.00 GiB total capacity; 3.42 GiB already allocated; 0 bytes free; 3.48 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Traceback (most recent call last):
  File "C:\Users\25424\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\25424\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\AI Drawing\Lora\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "D:\AI Drawing\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\\AI Drawing\\Lora\\kohya_ss\\venv\\Scripts\\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=D:/AI Drawing/stable-diffusion-webui/models/Stable-diffusion/chilloutmix_NiPrunedFp16Fix.safetensors', '--train_data_dir=D:/AI Drawing/Lora/Lora_database/shiya/image', '--resolution=512,512', '--output_dir=D:/AI Drawing/Lora/Lora_database/shiya/model', '--logging_dir=', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=90', '--train_batch_size=1', '--max_train_steps=900', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--cache_latents', '--optimizer_type=AdamW', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.
JasonTaverner commented 1 year ago

After unchecking the "Use 8bit adam" checkbox in Advanced Configuration, I found it still doesn't work. I want to know how to fix it.

Yes, I have the same error. I've tried with a low configuration (I have a 3080 10GB), but with 'Use 8bit adam' enabled I get the first error, and if I don't have it enabled I get this 'CUDA out of memory. Tried to allocate...' error. It's strange because last night it was working and this morning it's not.

bmaltais commented 1 year ago

The issue is you need to use 8bit adam given the amount of VRAM you have. Make sure to only use AdamW8bit as the optimizer.
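
In concrete terms, the generated command should then contain --optimizer_type=AdamW8bit and should not contain --use_8bit_adam. If memory is still tight, the CUDA error message above already points at the allocator workaround; something like the following before launching (the 128 MB split size is just an example value):

set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128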

JasonTaverner commented 1 year ago

The issue is you need to use 8bit adam given the amount of VRAM you have. Make sure to only use AdamW8bit as the optimizer.

Thanks, it's working now. Just disabling 'Use 8bit adam' and selecting AdamW8bit as the 'Optimizer' works. And in case it helps someone clueless like me: I was doing it in the 'Dreambooth' tab instead of 'Dreambooth LoRA'.

2blackbar commented 1 year ago

Wow, this breaks all trainings and yet the dev didn't notice? Thanks for the fix.

gekkomode commented 1 year ago

The issue is you need to use 8bit adam given the amount of VRAM you have. Make sure to only use AdamW8bit as the optimizer.

Thanks, it's working now. Just disabling 'Use 8bit adam' and selecting AdamW8bit as the 'Optimizer' works. And in case it helps someone clueless like me: I was doing it in the 'Dreambooth' tab instead of 'Dreambooth LoRA'.

You are my hero. Thank you so much! This is for you: 🏆

Rakshesha2024 commented 1 year ago

Try unchecking the "Use 8bit adam" checkbox in Advanced Configuration.

It's working now! Thank you very much, Guyray!!!! How could I have been so blind not to see that option? Thanks, man.

You're welcome, brother. I'm glad I could help

mxharms commented 1 year ago

Yeah... it is a config change issue introduced by the latest kohya_ss trainer update... I am not sure how best to fix this... I thought of removing the old 8bit checkbox, but this would break older config files... Perhaps I could implement some logic to disable the 8bit checkbox when a user selects anything but the AdamW8bit option...

Since the GUI (from what I understand) always passes an optimizer, wouldn't it be a better idea to remove that checkbox completely? And "migrate" existing configs to this new state? E.g. something like:
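
(a rough illustration of that migration; this assumes the GUI stores its settings in a JSON file with a use_8bit_adam flag and an optimizer field, both of which are placeholder names for the example)

import json

def migrate_config(path):
    # Placeholder key names; the real config schema may differ.
    with open(path, "r", encoding="utf-8") as f:
        cfg = json.load(f)
    # Translate the legacy checkbox into the new optimizer field, then drop the legacy key.
    if cfg.pop("use_8bit_adam", False):
        cfg["optimizer"] = "AdamW8bit"
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cfg, f, indent=2)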

And then in any case, remove the checkbox from the GUI and config. Would that work?

ludwigjer commented 1 year ago

Could anyone run this using 8GB of VRAM after the update? I selected the AdamW8bit optimizer, but now it gives the out-of-memory error:

Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 8.00 GiB total capacity; 7.20 GiB already allocated; 0 bytes free; 7.29 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
mxharms commented 1 year ago

Could anyone run this using 8GB of VRAM after the update? I selected the AdamW8bit optimizer, but now it gives the out-of-memory error:

Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 8.00 GiB total capacity; 7.20 GiB already allocated; 0 bytes free; 7.29 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Are you by chance using the "Dreambooth" tab instead of the "Dreambooth LoRA" tab? That seems to constantly happen to a lot of people (including myself 😄).

ludwigjer commented 1 year ago

Thanks @mxharms, you were right. By the way, there is another error; do you have any idea?

Traceback (most recent call last):
  File "J:\sd\kohya_ss\train_network.py", line 507, in <module>
    train(args)
  File "J:\sd\kohya_ss\train_network.py", line 135, in train
    network.load_weights(args.network_weights)
  File "J:\sd\kohya_ss\networks\lora.py", line 139, in load_weights
    self.weights_sd = torch.load(file, map_location='cpu')
  File "J:\sd\kohya_ss\venv\lib\site-packages\torch\serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "J:\sd\kohya_ss\venv\lib\site-packages\torch\serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '{'.
Traceback (most recent call last):
  File "C:\Users\Luis\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Luis\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "J:\sd\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "J:\sd\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "J:\sd\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "J:\sd\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
mxharms commented 1 year ago

Thanks @mxharms, you were right. By the way, there is another error; do you have any idea?

_pickle.UnpicklingError: invalid load key, '{'.

Sorry, don't really know much about Python or the internals of this. ☹️

ludwigjer commented 1 year ago

Sorry, don't really know much about Python or the internals of this. ☹️

No problems :)

StudioCMC commented 1 year ago

This thread saved my sanity; the 8bit switch was throwing me off. Thank you for this post! :)

mtnmecca commented 1 year ago

This thread saved my sanity; the 8bit switch was throwing me off. Thank you for this post! :)

Same here. Thanks!!!

SwannSchilling commented 1 year ago

I still cannot train with 8bit Adam... I need to manually edit it out and run it in the terminal!

Ekaitza1985 commented 1 year ago

So, since the last update, which is the optimal config to train people with LoRA? Which optimizer, and with or without 8bit Adam? Could anyone help me? That would be great; I am so confused at the moment.

bmaltais commented 1 year ago

If your hardware permits it, train without 8bit, so AdamW. Otherwise use AdamW8bit.

bmaltais commented 1 year ago

Version 21.0.1 should now address this.

Ekaitza1985 commented 1 year ago

I am trying with 50 photos, batch size 2, 6 epochs, fp16 (both), LR 0.0001 / constant / LR warmup 0, optimizer AdamW, text encoder LR 5e-5 and U-Net LR 0.0001; nothing more in Advanced except xformers.

Since your last commit the process runs fine, but as far as I can tell the results do not: 50 photos (in a 100_name folder), so 50 × 100 = 5000, and (5000 / 2) × 6 gives me 15,000 steps across 4 safetensors files, and none of them did a good inference of the model. Any idea why? PS: my results are now 9,326 KB instead of the usual 147,572 KB.
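
For reference, the step count quoted above works out as follows (a quick sketch of the arithmetic; the 100 repeats come from the 100_ prefix of the folder name):

images, repeats, batch_size, epochs = 50, 100, 2, 6
steps_per_epoch = images * repeats // batch_size  # 5000 / 2 = 2500
total_steps = steps_per_epoch * epochs            # 2500 * 6 = 15000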

mykeehu commented 1 year ago

Network values are low, set both to 128
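
In terms of the flags used in the launch commands earlier in this thread, that suggestion corresponds to something like the following; the drop from roughly 147 MB to 9 MB files is consistent with a much smaller network dimension:

--network_dim=128 --network_alpha=128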

Ekaitza1985 commented 1 year ago

Network values are low, set both to 128

Since the last update it works like a charm!!! I'm so happy, and thank you for the answer ^^!

udterry commented 1 year ago

With the last update, 8bit Adam cannot be used: AssertionError: both option use_8bit_adam and optimizer_type are specified / use_8bit_adamとoptimizer_typeの両方のオプションが指定されています

Ahoo-Mun commented 1 year ago

With the last update, 8bit Adam cannot be used: AssertionError: both option use_8bit_adam and optimizer_type are specified / use_8bit_adamとoptimizer_typeの両方のオプションが指定されています

Yeah I'm getting the same error even after updating.

noobbuddy commented 1 year ago

It's still not working even after disabling 8bit Adam.

Folder 100_test: 1100 steps
max_train_steps = 1100
stop_text_encoder_training = 0
lr_warmup_steps = 110
accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket --pretrained_model_name_or_path="D:/stable-diffusion/stable-diffusion-webui/models/Stable-diffusion/perfectWorld_perfectWorldBakedVAE.safetensors" --train_data_dir="D:/stable-diffusion/output/image" --resolution=512,512 --output_dir="D:/stable-diffusion/output/model" --logging_dir="D:/stable-diffusion/output/log" --network_alpha="1" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=8 --output_name="last" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="110" --train_batch_size="1" --max_train_steps="1100" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --cache_latents --optimizer_type="DAdaptation" --bucket_reso_steps=64 --bucket_no_upscale
prepare tokenizer
Use DreamBooth method.
Traceback (most recent call last):
  File "D:\stable-diffusion\kohya\kohya_ss\train_network.py", line 507, in <module>
    train(args)
  File "D:\stable-diffusion\kohya\kohya_ss\train_network.py", line 61, in train
    train_dataset = DreamBoothDataset(args.train_batch_size, args.train_data_dir, args.reg_data_dir,
TypeError: DreamBoothDataset.__init__() takes 13 positional arguments but 21 were given
Traceback (most recent call last):
  File "D:\python\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "D:\python\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\stable-diffusion\kohya\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "D:\stable-diffusion\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "D:\stable-diffusion\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "D:\stable-diffusion\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\\stable-diffusion\\kohya\\kohya_ss\\venv\\Scripts\\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=D:/stable-diffusion/stable-diffusion-webui/models/Stable-diffusion/perfectWorld_perfectWorldBakedVAE.safetensors', '--train_data_dir=D:/stable-diffusion/output/image', '--resolution=512,512', '--output_dir=D:/stable-diffusion/output/model', '--logging_dir=D:/stable-diffusion/output/log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=110', '--train_batch_size=1', '--max_train_steps=1100', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--cache_latents', '--optimizer_type=DAdaptation', '--bucket_reso_steps=64', '--bucket_no_upscale']' returned non-zero exit status 1.

mtnmecca commented 1 year ago

I got it to work after making sure the lower "Use 8bit adam" checkbox in Advanced Configuration is not selected, even though "AdamW8bit" is selected under the "Optimizer" selection box.

noobbuddy commented 1 year ago

I give up. I have been trying to fix this issue for 3 weeks now and it still doesn't work.

Ahoo-Mun commented 1 year ago

I'd recommend just using the Lion optimizer instead; it works about the same but doesn't suffer from the same overtraining issues. Just make sure to divide your learning rate by 10.
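
For example, assuming your build exposes Lion as an optimizer_type, the learning rates used earlier in this thread (0.0001 / 0.0001 / 5e-5) would become roughly:

--optimizer_type=Lion --learning_rate=0.00001 --unet_lr=0.00001 --text_encoder_lr=5e-6

That is each value divided by 10, as suggested above.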

Havasiz commented 1 year ago

I cannot find the checkbox to uncheck 8bit Adam. Did they remove it?

Havasiz commented 1 year ago

I'd recommend just using the Lion optimizer instead; it works about the same but doesn't suffer from the same overtraining issues. Just make sure to divide your learning rate by 10.

Hey, what do you mean by dividing the learning rate by 10? So if I have 25 images, what do I put and where?