I'm having an issue when I get to the training steps. Can anybody help?
2024-09-07 21:52:32 INFO move vae and unet back to original device flux_train_network.py:232
INFO create LoRA network. base dim (rank): 8, alpha: 1 lora.py:935
INFO neuron dropout: p=None, rank dropout: p=None, module dropout: p=None lora.py:936
INFO create LoRA for Text Encoder 1: lora.py:1027
INFO create LoRA for Text Encoder 2: lora.py:1027
INFO create LoRA for Text Encoder: 24 modules. lora.py:1035
INFO create LoRA for U-Net: 0 modules. lora.py:1043
INFO enable LoRA for text encoder: 24 modules lora.py:1084
INFO enable LoRA for U-Net: 0 modules lora.py:1089
FLUX: Gradient checkpointing enabled.
prepare optimizer, data loader etc.
INFO use 8-bit AdamW optimizer | {} train_util.py:4343
enable fp8 training for U-Net.
enable fp8 training for Text Encoder.
running training / 学習開始
num train images * repeats / 学習画像の数×繰り返し回数: 10
num reg images / 正則化画像の数: 0
num batches per epoch / 1epochのバッチ数: 10
num epochs / epoch数: 300
batch size per device / バッチサイズ: 1
gradient accumulation steps / 勾配を合計するステップ数 = 1
total optimization steps / 学習ステップ数: 3000
steps: 0%| | 0/3000 [00:00<?, ?it/s]2024-09-07 21:54:03 INFO unet dtype: torch.float8_e4m3fn, device: cuda:0 train_network.py:1046
INFO text_encoder [0] dtype: torch.float16, device: cuda:0 train_network.py:1052
INFO text_encoder [1] dtype: torch.float16, device: cpu train_network.py:1052
epoch 1/300
INFO epoch is incremented. current_epoch: 0, epoch: 1 train_util.py:668
C:\Users\User\kohya_ss\venv\lib\site-packages\transformers\models\clip\modeling_clip.py:480: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
Traceback (most recent call last):
File "C:\Users\User\kohya_ss\sd-scripts\flux_train_network.py", line 446, in
trainer.train(args)
File "C:\Users\User\kohya_ss\sd-scripts\train_network.py", line 1141, in train
noise_pred, target, timesteps, huber_c, weighting = self.get_noise_pred_and_target(
File "C:\Users\User\kohya_ss\sd-scripts\flux_train_network.py", line 360, in get_noise_pred_and_target
assert network.train_blocks == "single", "train_blocks must be single for split mode"
File "C:\Users\User\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 1729, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'LoRANetwork' object has no attribute 'train_blocks'
steps: 0%| | 0/3000 [00:01<?, ?it/s]
Traceback (most recent call last):
File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\User\kohya_ss\venv\Scripts\accelerate.EXE\__main__.py", line 7, in
File "C:\Users\User\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 48, in main
args.func(args)
File "C:\Users\user\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1106, in launch_command
simple_launcher(args)
File "C:\Users\User\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 704, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Users\User\kohya_ss\venv\Scripts\python.exe', 'C:/Users/User/kohya_ss/sd-scripts/flux_train_network.py', '--config_file', 'O:\AI\train\SDXL_training\soft\flux\model/config_lora-20240907-214436.toml', '--cache_text_encoder_outputs']' returned non-zero exit status 1.
21:54:07-668388 INFO Training has ended.
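One thing I noticed while re-reading the log: it says `create LoRA for U-Net: 0 modules`, and the assertion that fails references `network.train_blocks`, which looks FLUX-specific. So my guess is that my config is loading the standard SD/SDXL LoRA module (`networks.lora`) instead of the FLUX-aware one. A sketch of what I think the relevant part of the TOML should look like (the dim/alpha values are just the ones from my log, and I haven't verified this fixes it):

```toml
# Hypothetical fragment of config_lora-20240907-214436.toml.
# network_module is the setting I suspect matters: "networks.lora_flux"
# is the FLUX LoRA module in sd-scripts, while "networks.lora" is the
# SD/SDXL one, which creates 0 U-Net modules on FLUX and has no
# train_blocks attribute.
network_module = "networks.lora_flux"
network_dim = 8
network_alpha = 1
```

If anyone can confirm whether `network_module` is the problem here, or whether it's something else in split mode, I'd appreciate it.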