kohya-ss / sd-scripts

Apache License 2.0

Error at generating a sample preview image in training Lora (MAC) #390

Closed vkbest closed 1 year ago

vkbest commented 1 year ago

I have set up LoRA training on my Mac, but when I enable sample previews every N steps or epochs, I get this error:

generating sample images at step / サンプル画像生成 ステップ: 50
Traceback (most recent call last):
  File "/Users/myuser/Documents/kohya_ss/train_network.py", line 748, in <module>
    train(args)
  File "/Users/myuser/Documents/kohya_ss/train_network.py", line 616, in train
    train_util.sample_images(
  File "/Users/myuser/Documents/kohya_ss/library/train_util.py", line 2959, in sample_images
    cuda_rng_state = torch.cuda.get_rng_state()
  File "/Users/myuser/Documents/kohya_ss/venv/lib/python3.10/site-packages/torch/cuda/random.py", line 22, in get_rng_state
    _lazy_init()
  File "/Users/myuser/Documents/kohya_ss/venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
steps:   4%|████████▍ | 50/1150 [01:07<24:42,  1.35s/it, loss=0.0911]
Traceback (most recent call last):
  File "/Users/myuser/Documents/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/Users/myuser/Documents/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/Users/myuser/Documents/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "/Users/myuser/Documents/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)

It looks like the sample preview code calls into CUDA unconditionally, even though I'm training with MPS (Apple's Metal Performance Shaders).
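For anyone hitting the same traceback, the failing call is `torch.cuda.get_rng_state()`, which asserts on a PyTorch build without CUDA. Below is a minimal sketch of one possible guard, not the maintainers' fix: it saves and restores the CUDA RNG state only when `torch.cuda.is_available()` returns true. The helper name `preserve_cuda_rng_state` is hypothetical, and the torch module is passed in as an argument so the sketch can be run with a stand-in object on a machine without PyTorch.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def preserve_cuda_rng_state(torch_module):
    """Save and restore the CUDA RNG state, skipping both steps without CUDA.

    Hypothetical helper: `torch_module` is expected to be the real `torch`
    module (or any object exposing the same `cuda` functions).
    """
    cuda_ok = torch_module.cuda.is_available()
    # Only touch the CUDA RNG when a CUDA device is actually usable.
    state = torch_module.cuda.get_rng_state() if cuda_ok else None
    try:
        yield
    finally:
        if state is not None:
            torch_module.cuda.set_rng_state(state)


def _get_rng_state_without_cuda():
    # Mimics the error from the traceback above on a non-CUDA build.
    raise AssertionError("Torch not compiled with CUDA enabled")


# Stand-in for a torch build without CUDA (for example, an MPS-only Mac).
fake_torch = SimpleNamespace(
    cuda=SimpleNamespace(
        is_available=lambda: False,
        get_rng_state=_get_rng_state_without_cuda,
        set_rng_state=lambda state: None,
    )
)

with preserve_cuda_rng_state(fake_torch):
    sampled = True  # sample image generation would go here
```

With the real library this would be used as `with preserve_cuda_rng_state(torch): ...` around the preview generation. Whether the sampling code needs further MPS-specific changes beyond this guard is not something this sketch covers.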

mekaonee commented 1 year ago

@vkbest I have the same problem with my M1 Max 32GB. Did you find a solution?

Traceback (most recent call last):
  File "/Users/username/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/Users/username/kohya_ss/venv/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/Users/username/kohya_ss/venv/lib/python3.9/site-packages/accelerate/commands/launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "/Users/username/kohya_ss/venv/lib/python3.9/site-packages/accelerate/commands/launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/Users/username/kohya_ss/venv/bin/python', 'train_network.py', '--v2', '--enable_bucket', '--pretrained_model_name_or_path=/Users/username/stable-diffusion-webui/models/Stable-diffusion/v2-1_512-ema-pruned.safetensors', '--train_data_dir=/Users/username/stable-diffusion-webui/mymodels/mertk/lora/img', '--resolution=512,512', '--output_dir=/Users/username/stable-diffusion-webui/mymodels/mertk/lora/model', '--logging_dir=/Users/username/stable-diffusion-webui/mymodels/mertk/lora/log', '--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=128', '--output_name=mertk', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=2', '--max_train_steps=2900', '--save_every_n_epochs=1', '--mixed_precision=no', '--save_precision=float', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.