Cookie4Free closed 1 month ago
Hello, it seems you're using the main branch version. Am I right?
To be honest, I don't really know, but I think so. I followed the installation instructions in the README.
OK. Try installing the dev branch: git clone -b dev https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
The training time has dropped significantly, thanks. But do I have to do anything about the other errors? And in general, are there any good guides on choosing the best settings, or presets that I can download?
steps: 0%| | 0/2460 [00:00<?, ?it/s]
epoch 1/10
A matching Triton is not available, some optimizations will not be enabled
Traceback (most recent call last):
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\xformers\__init__.py", line 55, in _is_triton_available
from xformers.triton.softmax import softmax as triton_softmax # noqa
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\xformers\triton\softmax.py", line 11, in <module>
import triton
ModuleNotFoundError: No module named 'triton'
B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\diffusers\models\attention_processor.py:1039: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
hidden_states = F.scaled_dot_product_attention(
NaN found in latents, replacing with zeros
B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\torch\autograd\graph.py:744: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ..\aten\src\ATen\native\cudnn\Conv_v8.cpp:919.)
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
steps: 0%| | 1/2460 [00:33<23:10:26, 33.93s/it, avr_loss=0.0107]NaN found in latents, replacing with zeros
steps: 0%| | 2/2460 [01:06<22:41:44, 33.24s/it, avr_loss=nan]
After some time I get these errors as well. Can they be ignored?
steps: 1%|▍ | 19/2460 [10:09<21:44:37, 32.07s/it, avr_loss=nan]NaN found in latents, replacing with zeros
Traceback (most recent call last):
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\sdxl_train_network.py", line 189, in <module>
trainer.train(args)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\train_network.py", line 781, in train
noise_pred = self.call_unet(
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\sdxl_train_network.py", line 169, in call_unet
noise_pred = unet(noisy_latents, timesteps, text_embedding, vector_embedding)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\accelerate\utils\operations.py", line 636, in forward
return model_forward(*args, **kwargs)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\accelerate\utils\operations.py", line 624, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\torch\amp\autocast_mode.py", line 16, in decorate_autocast
return func(*args, **kwargs)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\library\sdxl_original_unet.py", line 1106, in forward
h = call_module(module, h, emb, context)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\library\sdxl_original_unet.py", line 1090, in call_module
x = layer(x, context)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\library\sdxl_original_unet.py", line 745, in forward
hidden_states = block(hidden_states, context=encoder_hidden_states, timestep=timestep)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\library\sdxl_original_unet.py", line 668, in forward
output = self.forward_body(hidden_states, context, timestep)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\library\sdxl_original_unet.py", line 643, in forward_body
hidden_states = self.attn1(norm_hidden_states) + hidden_states
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\library\sdxl_original_unet.py", line 453, in forward
hidden_states = self._attention(query, key, value)
File "B:\Stable_Diffusion\LoRA_Easy_Training_Scripts\sd_scripts\library\sdxl_original_unet.py", line 475, in _attention
attention_probs = attention_probs.to(value.dtype)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 640.00 MiB. GPU
steps: 1%|▍ | 19/2460 [10:31<22:32:09, 33.24s/it, avr_loss=nan]
Failed to train because of error:
Command '['B:\\Stable_Diffusion\\LoRA_Easy_Training_Scripts\\sd_scripts\\venv\\Scripts\\python.exe', 'sd_scripts\\sdxl_train_network.py', '--config_file=runtime_store\\config.toml', '--dataset_config=runtime_store\\dataset.toml']' returned non-zero exit status 1.
OK, you said your GPU is an RTX 4090, right?
If the progress bar shows avr_loss=nan while training, it means the LoRA is ruined.
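A run that has already gone to NaN can be spotted early instead of waiting for all 2460 steps. The following is a minimal sketch, assuming the training output has been captured as a list of text lines in the progress-bar format shown in the log above; the helper names `parse_loss` and `first_nan_line` are hypothetical and are not part of sd_scripts.

```python
import math
import re

# Matches the loss field in progress lines such as
# "steps: 0%| | 2/2460 [01:06<22:41:44, 33.24s/it, avr_loss=nan]"
LOSS_PATTERN = re.compile(r"avr_loss=([^\],\s]+)")


def parse_loss(line):
    """Return the avr_loss value found in a progress line, or None if absent."""
    match = LOSS_PATTERN.search(line)
    if match is None:
        return None
    try:
        return float(match.group(1))  # float("nan") parses to NaN
    except ValueError:
        return None


def first_nan_line(lines):
    """Return the index of the first line whose loss is NaN, or None."""
    for index, line in enumerate(lines):
        loss = parse_loss(line)
        if loss is not None and math.isnan(loss):
            return index
    return None


# Two progress lines copied from the log in this thread.
log = [
    "steps: 0%| | 1/2460 [00:33<23:10:26, 33.93s/it, avr_loss=0.0107]NaN found in latents, replacing with zeros",
    "steps: 0%| | 2/2460 [01:06<22:41:44, 33.24s/it, avr_loss=nan]",
]
print(first_nan_line(log))  # prints 1: the loss is already NaN at the second step
```

In the log above the loss turns to NaN at step 2, so the run could have been stopped after about a minute.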
[[subsets]]
num_repeats = 2
caption_extension = ".txt"
shuffle_caption = false
flip_aug = false
color_aug = false
random_crop = false
is_reg = false
image_dir = "B:/Stable_Diffusion/Lora training/nanoless/dataset"
keep_tokens = 0
[noise_args]
[sample_args]
[logging_args]
[general_args.args]
pretrained_model_name_or_path = "B:/Stable_Diffusion/Data/checkpoints/PonyDiffusionV6XL_SDXL.safetensors"
mixed_precision = "fp16"
seed = 23
max_data_loader_n_workers = 1
persistent_data_loader_workers = true
max_token_length = 225
prior_loss_weight = 1.0
sdxl = true
max_train_epochs = 10
full_bf16 = false
full_fp16 = true
vae = "B:/Stable_Diffusion/Data/VAE/sdxl_vae.safetensors"
[general_args.dataset_args]
resolution = [1024, 1024]
batch_size = 2
[network_args.args]
network_dim = 16
network_alpha = 8.0
min_timestep = 0
max_timestep = 1000
[optimizer_args.args]
optimizer_type = "AdamW"
lr_scheduler = "cosine"
learning_rate = 0.0001
max_grad_norm = 1.0
warmup_ratio = 0.05
min_snr_gamma = 5
[saving_args.args]
output_dir = "B:/Stable_Diffusion/Lora training/nanoless/output"
save_precision = "fp16"
save_model_as = "safetensors"
[bucket_args.dataset_args]
enable_bucket = true
min_bucket_reso = 256
max_bucket_reso = 1024
bucket_reso_steps = 64
[network_args.args.network_args]
conv_dim = 4
conv_alpha = 4.0
algo = "locon"
[optimizer_args.args.optimizer_args]
weight_decay = "0.1"
betas = "0.9,0.99"
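To relate this config to the step count and time estimate in the log, here is a rough arithmetic sketch. It assumes steps per epoch is approximately ceil(images × num_repeats / batch_size) and ignores how bucketing splits batches. The image count of 246 is a hypothetical value chosen only because it reproduces the logged 2460 steps; the thread does not state the dataset size.

```python
import math


def steps_per_epoch(num_images, num_repeats, batch_size):
    """Approximate optimizer steps per epoch, ignoring bucketing effects."""
    return math.ceil(num_images * num_repeats / batch_size)


def estimated_hours(total_steps, seconds_per_step):
    """Total wall-clock time in hours at a constant step duration."""
    return total_steps * seconds_per_step / 3600


# Values taken from the config and the log in this thread.
num_repeats = 2
batch_size = 2
max_train_epochs = 10
seconds_per_step = 33.24

# Hypothetical dataset size (assumption, not stated in the thread).
num_images = 246

total_steps = steps_per_epoch(num_images, num_repeats, batch_size) * max_train_epochs
print(total_steps)
print(round(estimated_hours(total_steps, seconds_per_step), 1))
```

At roughly 33 seconds per step, 2460 steps come to a little under 23 hours, which is in line with the remaining-time figure the progress bar reports.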
What are you targeting to train? A character, style, concept?
I wanted to train a style. I also wanted to create a character later on after this project.
Do you have Discord?
Yes, I do. Same name as here: Cookie4Free
Did it get solved?
Yes. Installing the dev branch (git clone https://github.com/derrian-distro/LoRA_Easy_Training_Scripts -b dev) and adjusting the settings to prevent the out-of-memory error fixed it.
I'm sorry if this is a dumb question. I'm new to this, and I'm trying to learn how to do LoRAs. Most of the settings I imported are from this guide: https://civitai.com/models/22530/guide-make-your-own-loras-easy-and-free
To be honest, I couldn't find any guide on how to use this training script correctly. I'm pretty sure I'm doing something wrong here.
This is the error I get. Triton is only supported for Linux, so I guess I can ignore that one. As for the Torch and cuDNN error, I have no idea what's causing it. Also, 200 hours seems way too long to me—is that normal?
OS: Windows 11, GPU: RTX 4090, CPU: AMD Ryzen 7 5800X3D
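The Triton traceback quoted in this thread ends in a ModuleNotFoundError, and the accompanying message says only that some optimizations will not be enabled. A minimal sketch for checking whether the package is visible to the current Python environment, using only the standard library (the helper name `module_available` is hypothetical):

```python
import importlib.util
import platform


def module_available(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


print("OS:", platform.system())
print("triton importable:", module_available("triton"))
```

Run it with the Python interpreter inside the trainer's venv, since that is the environment the traceback refers to.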