RuntimeError: CUDA error: invalid argument

Basically, everything freezes at this point (step 0 of 1005, during the backward pass), and I'm not sure why. Full log:

Validating that requirements are satisfied.
All requirements satisfied.
Load CSS...
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Folder 67_ToniRizzo woman: 30 images found
Folder 67_ToniRizzo woman: 2010 steps
max_train_steps = 1005
stop_text_encoder_training = 0
lr_warmup_steps = 0
accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" --train_data_dir="E:\Documents\khoya models\ToniRizzo_Lora\image" --resolution=512,512 --output_dir="E:\Documents\khoya models\ToniRizzo_Lora\model" --logging_dir="E:\Documents\khoya models\ToniRizzo_Lora\log" --network_alpha="128" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=128 --output_name="TonniRizzo" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="constant" --train_batch_size="2" --max_train_steps="1005" --save_every_n_epochs="1" --mixed_precision="no" --save_precision="fp16" --caption_extension=".txt" --cache_latents --optimizer_type="AdamW" --max_data_loader_n_workers="1" --clip_skip=2 --bucket_reso_steps=64 --xformers --bucket_no_upscale
The following values were not passed to accelerate launch and had defaults used instead:
    --num_processes was set to a value of 1
    --num_machines was set to a value of 1
    --mixed_precision was set to a value of 'no'
    --dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
prepare tokenizer
Use DreamBooth method.
prepare images.
found directory E:\Documents\khoya models\ToniRizzo_Lora\image\67_ToniRizzo woman contains 30 image files
2010 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
  batch_size: 2
  resolution: (512, 512)
  enable_bucket: False

  [Subset 0 of Dataset 0]
    image_dir: "E:\Documents\khoya models\ToniRizzo_Lora\image\67_ToniRizzo woman"
    image_count: 30
    num_repeats: 67
    shuffle_caption: False
    keep_tokens: 0
    caption_dropout_rate: 0.0
    caption_dropout_every_n_epoches: 0
    caption_tag_dropout_rate: 0.0
    color_aug: False
    flip_aug: False
    face_crop_aug_range: None
    random_crop: False
    token_warmup_min: 1,
    token_warmup_step: 0,
    is_reg: False
    class_tokens: ToniRizzo woman
    caption_extension: .txt
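For reference, the step counts reported in this log are consistent with the folder's repeat prefix (67) and the batch size. Below is a minimal Python sketch of that arithmetic, assuming one optimizer step per batch, a single epoch, and no bucketing (the log reports enable_bucket: False). The function and parameter names are illustrative only and are not taken from train_network.py.

```python
import math


def expected_steps(image_count, num_repeats, batch_size, epochs=1, grad_accum=1):
    """Return (train_images, batches_per_epoch, total_steps) for a single folder.

    Hypothetical helper: it only mirrors the arithmetic visible in the log,
    not the trainer's actual implementation.
    """
    # Each image is seen num_repeats times per epoch.
    train_images = image_count * num_repeats
    # A final partial batch still counts as one batch.
    batches_per_epoch = math.ceil(train_images / batch_size)
    # One optimizer step every grad_accum batches.
    total_steps = math.ceil(batches_per_epoch / grad_accum) * epochs
    return train_images, batches_per_epoch, total_steps


# Values from the log: 30 images, 67 repeats, batch size 2.
print(expected_steps(image_count=30, num_repeats=67, batch_size=2))
# -> (2010, 1005, 1005)
```

This matches the "2010 steps" and "max_train_steps = 1005" lines, so the dataset setup itself appears to have been read as intended.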

[Dataset 0]
loading image sizes.
100%|██████████| 30/30 [00:00<00:00, 352.94it/s]
prepare dataset
prepare accelerator
Using accelerator 0.15.0 or above.
loading model for process 0/1
load Diffusers pretrained models
Downloading (…)ain/model_index.json: 100%|██████████| 543/543 [00:00<00:00, 543kB/s]
safety_checker\model.safetensors not found
Downloading (…)_encoder/config.json: 100%|██████████| 617/617 [00:00<00:00, 309kB/s]
Downloading (…)cheduler_config.json: 100%|██████████| 308/308 [00:00<00:00, 308kB/s]
Downloading (…)rocessor_config.json: 100%|██████████| 342/342 [00:00<00:00, 171kB/s]
Downloading (…)_checker/config.json: 100%|██████████| 4.72k/4.72k [00:00<00:00, 2.36MB/s]
Downloading (…)okenizer_config.json: 100%|██████████| 806/806 [00:00<00:00, 806kB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 472/472 [00:00<00:00, 472kB/s]
Downloading (…)tokenizer/merges.txt: 100%|██████████| 525k/525k [00:00<00:00, 1.86MB/s]
Downloading (…)tokenizer/vocab.json: 100%|██████████| 1.06M/1.06M [00:00<00:00, 3.47MB/s]
Downloading (…)d819/vae/config.json: 100%|██████████| 547/547 [00:00<00:00, 273kB/s]
Downloading (…)819/unet/config.json: 100%|██████████| 743/743 [00:00<00:00, 743kB/s]
Downloading (…)_pytorch_model.bin";: 100%|██████████| 335M/335M [00:22<00:00, 15.0MB/s]
Downloading (…)_model.safetensors";: 100%|██████████| 335M/335M [00:34<00:00, 9.71MB/s]
Downloading (…)"pytorch_model.bin";: 100%|██████████| 492M/492M [00:53<00:00, 9.20MB/s]
Downloading (…)"model.safetensors";: 100%|██████████| 492M/492M [00:57<00:00, 8.62MB/s]
Downloading (…)"pytorch_model.bin";: 100%|██████████| 1.22G/1.22G [01:45<00:00, 11.5MB/s]
Downloading (…)"model.safetensors";: 100%|██████████| 1.22G/1.22G [01:58<00:00, 10.3MB/s]
Downloading (…)_pytorch_model.bin";: 100%|██████████| 3.44G/3.44G [03:08<00:00, 18.3MB/s]
Downloading (…)_model.safetensors";: 100%|██████████| 3.44G/3.44G [03:29<00:00, 16.4MB/s]
Fetching 19 files: 100%|██████████| 19/19 [03:30<00:00, 11.09s/it]
E:\Documents\AI\Kohya\kohya_ss\venv\lib\site-packages\transformers\models\clip\feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
  warnings.warn(
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
Replace CrossAttention.forward to use xformers
[Dataset 0]
caching latents.
100%|██████████| 30/30 [00:26<00:00, 1.12it/s]
import network module: networks.lora
create LoRA network. base dim (rank): 128, alpha: 128.0
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.
use AdamW optimizer | {}
running training / 学習開始
  num train images * repeats / 学習画像の数×繰り返し回数: 2010
  num reg images / 正則化画像の数: 0
  num batches per epoch / 1epochのバッチ数: 1005
  num epochs / epoch数: 1
  batch size per device / バッチサイズ: 2
  gradient accumulation steps / 勾配を合計するステップ数 = 1
  total optimization steps / 学習ステップ数: 1005
steps:   0%|          | 0/1005 [00:00<?, ?it/s]epoch 1/1
Traceback (most recent call last):
  File "E:\Documents\AI\Kohya\kohya_ss\train_network.py", line 724, in <module>
    train(args)
  File "E:\Documents\AI\Kohya\kohya_ss\train_network.py", line 578, in train
    accelerator.backward(loss)
  File "E:\Documents\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 1316, in backward
    loss.backward(**kwargs)
  File "E:\Documents\AI\Kohya\kohya_ss\venv\lib\site-packages\torch\_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "E:\Documents\AI\Kohya\kohya_ss\venv\lib\site-packages\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "E:\Documents\AI\Kohya\kohya_ss\venv\lib\site-packages\torch\autograd\function.py", line 253, in apply
    return user_fn(self, *args)
  File "E:\Documents\AI\Kohya\kohya_ss\venv\lib\site-packages\xformers\ops.py", line 369, in backward
    ) = torch.ops.xformers.efficient_attention_backward_cutlass(
  File "E:\Documents\AI\Kohya\kohya_ss\venv\lib\site-packages\torch\_ops.py", line 143, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: CUDA error: invalid argument
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
steps:   0%|          | 0/1005 [00:14<?, ?it/s]
Traceback (most recent call last):
  File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "E:\Documents\AI\Kohya\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "E:\Documents\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "E:\Documents\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "E:\Documents\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['E:\Documents\AI\Kohya\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=E:\Documents\khoya models\ToniRizzo_Lora\image', '--resolution=512,512', '--output_dir=E:\Documents\khoya models\ToniRizzo_Lora\model', '--logging_dir=E:\Documents\khoya models\ToniRizzo_Lora\log', '--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=128', '--output_name=TonniRizzo', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=2', '--max_train_steps=1005', '--save_every_n_epochs=1', '--mixed_precision=no', '--save_precision=fp16', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.
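The error text itself suggests passing CUDA_LAUNCH_BLOCKING=1 so that the failing kernel is reported synchronously and the traceback points at the real call site. Below is a minimal Python sketch of one way to re-run the same command with that variable set. `build_debug_run` is a hypothetical helper, not part of kohya_ss or accelerate, and the command shown is abbreviated; the full argument list is the one printed in the log.

```python
import os
import subprocess


def build_debug_run(cmd, base_env=None):
    """Return (cmd, env) where env is a copy of the environment
    with CUDA_LAUNCH_BLOCKING=1 added.

    Hypothetical helper for illustration; it does not launch anything itself.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Make CUDA kernel launches synchronous so errors surface at the failing call.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return list(cmd), env


if __name__ == "__main__":
    # Abbreviated command (assumption: run from the kohya_ss folder inside its venv).
    cmd = ["accelerate", "launch", "--num_cpu_threads_per_process=2", "train_network.py"]
    cmd, env = build_debug_run(cmd)
    print("CUDA_LAUNCH_BLOCKING =", env["CUDA_LAUNCH_BLOCKING"])
    # Uncomment to actually launch the training run with the modified environment:
    # subprocess.run(cmd, env=env, check=True)
```

Training will run slower with this setting, so it is only meant for reproducing the error once and capturing a more reliable traceback.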

Linaqruf / kohya-trainer
