bmaltais / kohya_ss

On macOS (M1 chip), deployed in Anaconda, LoRA training crashes #576

Closed AnnaChen999 closed 8 months ago

AnnaChen999 commented 1 year ago

I am training a LoRA model on my M1 Mac (macOS) and the following error occurs. I don't know what it means. What should I do?

caching latents.
100%|█████████████████████████████████████████████████████████████████████████████| 40/40 [02:33<00:00,  3.85s/it]
import network module: networks.lora
create LoRA network. base dim (rank): 120, alpha: 128.0
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.
use AdamW optimizer | {}
Traceback (most recent call last):
  File "/Users/anna/kohya_ss/train_network.py", line 748, in <module>
    train(args)
  File "/Users/anna/kohya_ss/train_network.py", line 250, in train
    lr_scheduler = train_util.get_scheduler_fix(args, optimizer, accelerator.num_processes)
  File "/Users/anna/kohya_ss/library/train_util.py", line 2517, in get_scheduler_fix
    return wrap_check_needless_num_warmup_steps(schedule_func(optimizer))
  File "/Users/anna/kohya_ss/library/train_util.py", line 2489, in wrap_check_needless_num_warmup_steps
    raise ValueError(f"{name} does not require `num_warmup_steps`. Set None or 0.")
ValueError: SchedulerType.CONSTANT does not require `num_warmup_steps`. Set None or 0.
Traceback (most recent call last):
  File "/Users/anna/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/Users/anna/kohya_ss/venv/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/Users/anna/kohya_ss/venv/lib/python3.9/site-packages/accelerate/commands/launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "/Users/anna/kohya_ss/venv/lib/python3.9/site-packages/accelerate/commands/launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/Users/anna/kohya_ss/venv/bin/python', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=/Users/anna/Documents/sd/stable-diffusion-webui/models/Stable-diffusion/chilloutmix_NiPrunedFp32Fix.safetensors', '--train_data_dir=/Users/anna/kohya_ss/weds1_/image', '--resolution=512,512', '--output_dir=/Users/anna/kohya_ss/weds1_/model', '--logging_dir=/Users/anna/kohya_ss/weds1_/log', '--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=120', '--output_name=liziwei', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--lr_warmup_steps=120', '--train_batch_size=1', '--max_train_steps=1200', '--save_every_n_epochs=1', '--mixed_precision=no', '--save_precision=float', '--cache_latents', '--optimizer_type=AdamW', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--bucket_no_upscale']' returned non-zero exit status 1.

vkbest commented 1 year ago

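It's probably the LR warmup (% of steps) setting in the training parameters. Set it to 0. Your command passes --lr_scheduler=constant together with --lr_warmup_steps=120, and the constant scheduler does not take warmup steps, which is what the ValueError says.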
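
To make the rule concrete, here is a minimal Python sketch of the check. It is not the kohya_ss code: warmup_steps_for is a hypothetical helper, and the percent-to-steps conversion is only an assumption about how the GUI might derive --lr_warmup_steps (10% of the 1200 steps in the log would give the 120 seen in the command).

```python
# Hypothetical helper, NOT part of kohya_ss: it only mirrors the rule
# reported in the traceback (constant scheduler + nonzero warmup -> error).
def warmup_steps_for(scheduler: str, max_train_steps: int, warmup_percent: float) -> int:
    """Return a warmup step count that is safe to pass as --lr_warmup_steps."""
    # Assumption: the warmup percentage is converted to a step count like this.
    steps = int(max_train_steps * warmup_percent / 100)
    # The constant scheduler has no warmup phase, so a nonzero value is rejected.
    if scheduler == "constant" and steps != 0:
        raise ValueError(
            f"{scheduler} does not use warmup steps; set LR warmup to 0 (got {steps})."
        )
    return steps


if __name__ == "__main__":
    # Working combination: constant scheduler with warmup set to 0.
    print(warmup_steps_for("constant", 1200, 0))
    # A scheduler that does use warmup accepts a nonzero percentage.
    print(warmup_steps_for("constant_with_warmup", 1200, 10))
```

So there are two ways out: keep the constant scheduler and set the warmup to 0, or keep the warmup and switch to a scheduler that uses it, such as constant_with_warmup.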

AnnaChen999 commented 1 year ago

@vkbest Thanks for your answer. It solved my problem.