I am new to AI training. When I ran my first LoRA training, it stopped with the error below. My graphics card is an RTX 4090.
Thanks for your help.
[Error]:
[Dataset 0]
loading image sizes.
100%|████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 4514.86it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (128, 192), count: 40
bucket 1: resolution (192, 192), count: 80
bucket 2: resolution (192, 256), count: 80
bucket 3: resolution (256, 192), count: 40
bucket 4: resolution (256, 320), count: 40
bucket 5: resolution (384, 512), count: 40
bucket 6: resolution (384, 576), count: 160
bucket 7: resolution (448, 512), count: 40
bucket 8: resolution (448, 576), count: 40
bucket 9: resolution (512, 448), count: 40
bucket 10: resolution (640, 384), count: 120
mean ar error (without repeats): 0.0708142223913111
prepare accelerator
loading model for process 0/1
load Diffusers pretrained models: runwayml/stable-diffusion-v1-5
text_encoder\model.safetensors not found
Loading pipeline components...: 100%|████████████████████████████████████████████████████| 5/5 [00:01<00:00, 3.45it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
UNet2DConditionModel: 64, 8, 768, False, False
U-Net converted to original U-Net
Enable xformers for U-Net
[Dataset 0]
caching latents.
checking cache validity...
100%|████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 1806.07it/s]
caching latents...
0it [00:00, ?it/s]
prepare optimizer, data loader etc.
CUDA SETUP: Loading binary E:\kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll...
use 8-bit AdamW optimizer | {}
Traceback (most recent call last):
File "E:\kohya_ss\train_db.py", line 488, in <module>
train(args)
File "E:\kohya_ss\train_db.py", line 212, in train
unet, text_encoder, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
File "E:\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 1284, in prepare
result = tuple(
File "E:\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 1285, in <genexpr>
self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
File "E:\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 1090, in _prepare_one
return self.prepare_model(obj, device_placement=device_placement)
File "E:\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 1489, in prepare_model
model = torch.compile(model, **self.state.dynamo_plugin.to_kwargs())
File "E:\kohya_ss\venv\lib\site-packages\torch\__init__.py", line 1441, in compile
return torch._dynamo.optimize(backend=backend, nopython=fullgraph, dynamic=dynamic, disable=disable)(model)
File "E:\kohya_ss\venv\lib\site-packages\torch\_dynamo\eval_frame.py", line 413, in optimize
check_if_dynamo_supported()
File "E:\kohya_ss\venv\lib\site-packages\torch\_dynamo\eval_frame.py", line 375, in check_if_dynamo_supported
raise RuntimeError("Windows not yet supported for torch.compile")
RuntimeError: Windows not yet supported for torch.compile
Traceback (most recent call last):
File "C:\Users\hugua\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\hugua\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "E:\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "E:\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
args.func(args)
File "E:\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 986, in launch_command
simple_launcher(args)
File "E:\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 628, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['E:\kohya_ss\venv\Scripts\python.exe', './train_db.py', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=E:/loratest/image', '--resolution=512,512', '--output_dir=E:/loratest/model', '--logging_dir=E:/loratest/log', '--save_model_as=safetensors', '--output_name=Clash', '--lr_scheduler_num_cycles=3', '--max_data_loader_n_workers=0', '--learning_rate=0.0001', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=216', '--train_batch_size=1', '--max_train_steps=2160', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--caption_extension=.txt', '--cache_latents', '--cache_latents_to_disk', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--min_snr_gamma=10', '--xformers', '--bucket_no_upscale', '--multires_noise_iterations=8', '--multires_noise_discount=0.2']' returned non-zero exit status 1.