Open PENGUINADELIE opened 3 months ago
Could you post your settings (the config file) so we can see if something is wrong there?
Also, things you should check:
- Make sure no two images share the same file name; for example, check that Image.jpg and Image.png are not both present. Otherwise there will be a conflict.
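The file name check suggested above can be automated. This is only a minimal sketch: the function name `find_name_conflicts` and the set of extensions are my own assumptions, not something kohya_ss provides.

```python
from collections import defaultdict
from pathlib import Path

# Extensions treated as images here; this set is an assumption, adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def find_name_conflicts(image_dir):
    """Return {stem: [file names]} for images in image_dir that share a stem."""
    by_stem = defaultdict(list)
    for path in sorted(Path(image_dir).iterdir()):
        # Only look at regular files with a known image extension.
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            by_stem[path.stem].append(path.name)
    # Keep only stems that occur with more than one extension.
    return {stem: names for stem, names in by_stem.items() if len(names) > 1}
```

Calling `find_name_conflicts("/workspace/data/img_xyzminji/40xyzminji woman")` on the folder from the error log should return an empty dict if there are no clashes. Anything it does return is a pair such as Image.jpg and Image.png that is worth renaming.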
Thank you so much for answering my question. I've put together an image of how I prepared the data and what I clicked to get this result. Do you have any idea what might be causing this? I'd be very grateful for an answer.
I am trying to create an SDXL LoRA using Runpod. My dataset consists of 25 images of women, each with a size of 1024x1024 pixels. I keep encountering error logs indicating issues with the images. All images are 1024x1024 pixels, and I've tried using both PNG and JPG formats, but the issue persists. Does anyone know how to fix this?
[Error log]
  File "/workspace/kohya_ss/sd-scripts/sdxl_train_network.py", line 185, in <module>
    trainer.train(args)
  File "/workspace/kohya_ss/sd-scripts/train_network.py", line 272, in train
    train_dataset_group.cache_latents(vae, args.vae_batch_size, args.cache_latents_to_disk, accelerator.is_main_process)
  File "/workspace/kohya_ss/sd-scripts/library/train_util.py", line 2324, in cache_latents
    dataset.cache_latents(vae, vae_batch_size, cache_to_disk, is_main_process, file_suffix)
  File "/workspace/kohya_ss/sd-scripts/library/train_util.py", line 1146, in cache_latents
    cache_batch_latents(vae, cache_to_disk, batch, subset.flip_aug, subset.alpha_mask, subset.random_crop)
  File "/workspace/kohya_ss/sd-scripts/library/train_util.py", line 2772, in cache_batch_latents
    raise RuntimeError(f"NaN detected in latents: {info.absolute_path}")
RuntimeError: NaN detected in latents: /workspace/data/img_xyzminji/40xyzminji woman/xyzminji(1).jpg
Traceback (most recent call last):
  File "/workspace/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/workspace/kohya_ss/venv/bin/python', '/workspace/kohya_ss/sd-scripts/sdxl_train_network.py', '--config_file', '/workspace/data/model_lora/config_lora-20240729-035252.toml']' returned non-zero exit status 1.
03:53:26-826393 INFO Training has ended.
Disable the image augmentations (Crop, colour...), you can't use those while caching latents (except flip). Or disable caching of latents.
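To make that rule concrete, here is a rough sketch of a check you could run against your settings before training. The option names (`cache_latents`, `color_aug`, `random_crop`, `flip_aug`) are what I believe sd-scripts uses, so treat them as assumptions and compare them with your own config file. The dict is a made-up example, not the poster's actual settings.

```python
# Option names mirror the sd-scripts arguments as I understand them;
# treat them as assumptions and compare against your own config file.
CACHE_INCOMPATIBLE = ("color_aug", "random_crop")


def find_cache_conflicts(config):
    """Return the enabled augmentations that conflict with latent caching."""
    if not config.get("cache_latents", False):
        # Without latent caching, the augmentations are not a problem.
        return []
    return [name for name in CACHE_INCOMPATIBLE if config.get(name, False)]


# Hypothetical settings: flip_aug is allowed, color_aug is not.
config = {"cache_latents": True, "color_aug": True, "random_crop": False, "flip_aug": True}
print(find_cache_conflicts(config))  # ['color_aug']
```

If the list is not empty, either turn those augmentations off or set `cache_latents` to false.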
@PENGUINADELIE
You'll need to enable the No half VAE checkbox if you're getting the NaN detected error on SDXL.
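If you launch training from a script instead of the GUI, the checkbox should correspond to the `--no_half_vae` flag of `sdxl_train_network.py` (that mapping is my assumption, so verify it against your version). Here is a small sketch that adds the flag to the command shown in the error log; the helper name `with_no_half_vae` is made up for illustration.

```python
def with_no_half_vae(command):
    """Return a copy of the launch command with --no_half_vae present once."""
    flag = "--no_half_vae"
    return list(command) if flag in command else [*command, flag]


# Command taken from the error log in this thread.
command = [
    "/workspace/kohya_ss/venv/bin/python",
    "/workspace/kohya_ss/sd-scripts/sdxl_train_network.py",
    "--config_file",
    "/workspace/data/model_lora/config_lora-20240729-035252.toml",
]
print(" ".join(with_no_half_vae(command)))
```

In the log above the script is actually started through accelerate, so in practice the flag would go at the end of that launch command.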
Disable the image augmentations (Crop, colour...), you can't use those while caching latents (except flip). Or disable caching of latents.
Thank you so much for your reply. You helped me solve the problem!
@PENGUINADELIE
You'll need to enable the No half VAE checkbox if you're getting the NaN detected error on SDXL.
The no half vae check worked well for me, thank you so much for your answer.
@PENGUINADELIE
You'll need to enable the No half VAE checkbox if you're getting the NaN detected error on SDXL.
This helped me as well. The SDXL 1.0 base model was giving me errors when training a LoRA, but this solved it. Thanks!