Closed Esvictum closed 6 months ago
Here's how you read those errors:
tensorflow.python.framework.errors_impl.FailedPreconditionError: D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/log is not a directory
Did something happen to your "D:\AI Zeug\Dump für alte Bilder und Lora source\Lora source Spec\log"?
Make sure the log directory exists. Then see if this helps: https://github.com/bmaltais/kohya_ss/discussions/1744 Since you are using German, the Ü can cause issues, so try replacing it. I know that Ä, Ö and Å, for example, can cause directory errors with bitsandbytes.
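To check the "make sure the log directory exists" part before starting a run, something like the following could be used. It is only a minimal sketch: check_log_dir is a hypothetical helper made up for this example, not part of kohya_ss, and the path is the one from the error message.

```python
from pathlib import Path


def check_log_dir(path_str: str) -> str:
    """Report whether a logging directory path looks usable.

    Hypothetical helper for illustration only; not part of kohya_ss.
    """
    path = Path(path_str)
    if path.is_dir():
        return "ok: directory exists"
    if path.exists():
        # Something with that name exists, but it is a file, not a folder.
        return "problem: path exists but is not a directory"
    return "missing: directory does not exist yet"


if __name__ == "__main__":
    log_dir = "D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/log"
    print(check_log_dir(log_dir))
```

If this reports that the directory is fine while the trainer still complains, the path itself (spaces, umlauts) becomes the more likely suspect.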
Good point. Whenever a problem involving paths occurs, it's generally a good idea to try switching to paths containing only characters from the POSIX Portable Filename Character Set (ASCII alphanumerics, ".", "_", and "-") to rule that out. (E.g., I've had LyX projects refuse to render because LaTeX modules weren't properly tested with paths containing spaces.)
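As a rough illustration of that check, here is a small Python sketch that lists which folder names in a path contain characters outside the portable set. The function name non_portable_parts is made up for this example, and the drive-letter handling is a simplifying assumption.

```python
import string

# POSIX Portable Filename Character Set: A-Z, a-z, 0-9, ".", "_" and "-"
PORTABLE_CHARS = set(string.ascii_letters + string.digits + "._-")


def non_portable_parts(path_str: str) -> dict:
    """Map each path component to the characters in it that are not portable.

    Hypothetical helper for illustration only.
    """
    report = {}
    # Treat both "/" and "\" as separators so Windows paths work on any OS.
    for part in path_str.replace("\\", "/").split("/"):
        if not part or (len(part) == 2 and part[1] == ":"):
            continue  # skip empty pieces and drive letters such as "D:"
        bad = sorted({ch for ch in part if ch not in PORTABLE_CHARS})
        if bad:
            report[part] = bad
    return report


if __name__ == "__main__":
    path = "D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/log"
    for folder, chars in non_portable_parts(path).items():
        print(repr(folder), "->", chars)
```

For the path in this issue, every folder except "log" is flagged for spaces, and one of them for the "ü" as well. Renaming to something like AI_Zeug would take both out of the picture.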
I can't believe it. I removed the "ü" and now it just works... after reinstalling everything for hours. I would never have guessed that "Umlaute" are considered evil in the computer and AI world. Thank you very much.
It's more that some programs and programming languages haven't been properly tested with Unicode, so they have trouble with anything outside the set of characters in ASCII (which are common to almost all legacy encoding systems, bar a few like Shift-JIS.)
In Kohya's case, it's probably just that Python has some legacy baggage that makes it easy to accidentally do Unicode wrong. (Source: I've been coding in Python since I was a teenager.)
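To give a feel for how that kind of thing goes wrong in general, here is a small sketch of an encoding mix-up. It is not confirmed to be what actually happened inside TensorFlow or Kohya here, only an illustration of the failure mode.

```python
name = "für"

# The same text becomes different bytes depending on the encoding.
utf8_bytes = name.encode("utf-8")      # ü turns into two bytes: 0xC3 0xBC
cp1252_bytes = name.encode("cp1252")   # ü is the single byte 0xFC

# If one layer writes UTF-8 and another reads it as a legacy Windows
# code page, the name is garbled and no longer matches the real folder.
misread = utf8_bytes.decode("cp1252")

# Names made only of ASCII characters are identical in both encodings,
# which is why they survive this kind of mix-up.
ascii_safe = "log".encode("utf-8") == "log".encode("cp1252")

if __name__ == "__main__":
    print(utf8_bytes, cp1252_bytes, misread, ascii_safe)
```

Here misread ends up as "fÃ¼r", which is the typical look of an umlaut that went through the wrong decoder.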
@Esvictum If your issue is resolved, please consider closing this issue.
Hello and welcome to my problem: my program does not work anymore, although it did a month ago. Unfortunately, I made changes to my setup with just half of my brain, so here we are.
Now, when I run my A1111 Kohya Dreambooth LoRA trainer, this is what happens. I am thankful for any suggestions.
17:46:52-538662 INFO Start training LoRA Standard ...
17:46:52-539663 INFO Checking for duplicate image filenames in training data directory...
17:46:52-543667 INFO Valid image folder names found in: D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/image
17:46:52-548671 INFO Folder 100_Spec: 27 images found
17:46:52-548671 INFO Folder 100_Spec: 2700 steps
17:46:52-549671 INFO Total steps: 2700
17:46:52-550673 INFO Train batch size: 1
17:46:52-551674 INFO Gradient accumulation steps: 1
17:46:52-552674 INFO Epoch: 1
17:46:52-552674 INFO Regulatization factor: 1
17:46:52-553675 INFO max_train_steps (2700 / 1 / 1 1 1) = 2700
17:46:52-554676 INFO stop_text_encoder_training = 0
17:46:52-555677 INFO lr_warmup_steps = 270
17:46:52-556678 INFO Saving training config to D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/output\spec_20240218-174652.json...
17:46:52-558680 INFO accelerate launch --num_cpu_threads_per_process=2 "./train_network.py" --bucket_no_upscale --bucket_reso_steps=64 --cache_latents --caption_extension=".txt" --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --learning_rate="0.0001" --logging_dir="D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/log" --lr_scheduler="cosine" --lr_scheduler_num_cycles="1" --lr_warmup_steps="270" --max_data_loader_n_workers="0" --max_grad_norm="1" --resolution="512,512" --max_train_steps="2700" --mixed_precision="fp16" --network_alpha="1" --network_dim=8 --network_module=networks.lora --optimizer_type="AdamW8bit" --output_dir="D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/output" --output_name="pendulum" --pretrained_model_name_or_path="D:/AI Interface/webui/models/Stable-diffusion/realisticVisionV60B1_v60B1VAE.safetensors" --save_every_n_epochs="1" --save_model_as=safetensors --save_precision="float" --text_encoder_lr=0.0001 --train_batch_size="1" --train_data_dir="D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/image" --unet_lr=0.0001 --xformers
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
prepare tokenizer
Using DreamBooth method.
prepare images.
found directory D:\AI Zeug\Dump für alte Bilder und Lora source\Lora source Spec\image\100_Spec contains 27 image files
2700 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 1
resolution: (512, 512)
enable_bucket: True
network_multiplier: 1.0
min_bucket_reso: 256
max_bucket_reso: 2048
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "D:\AI Zeug\Dump für alte Bilder und Lora source\Lora source Spec\image\100_Spec"
image_count: 27
num_repeats: 100
shuffle_caption: False
keep_tokens: 0
keep_tokens_separator:
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
caption_prefix: None
caption_suffix: None
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: Spec
caption_extension: .txt
[Dataset 0]
loading image sizes.
100%|████████████████████████████████████████████████████████████████████████████████| 27/27 [00:00<00:00, 3372.43it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (384, 512), count: 300
bucket 1: resolution (448, 320), count: 900
bucket 2: resolution (640, 384), count: 1500
mean ar error (without repeats): 0.08395061728395048
preparing accelerator
loading model for process 0/1
load StableDiffusion checkpoint: D:/AI Interface/webui/models/Stable-diffusion/realisticVisionV60B1_v60B1VAE.safetensors
UNet2DConditionModel: 64, 8, 768, False, False
loading u-net:
loading vae:
loading text encoder:
Enable xformers for U-Net
import network module: networks.lora
[Dataset 0]
caching latents.
checking cache validity...
100%|██████████████████████████████████████████████████████████████████████████████████████████| 27/27 [00:00<?, ?it/s]
caching latents...
100%|██████████████████████████████████████████████████████████████████████████████████| 27/27 [00:01<00:00, 16.46it/s]
create LoRA network. base dim (rank): 8, alpha: 1.0
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder:
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
binary_path: D:\Kohya\kohya_ss\venv\lib\site-packages\bitsandbytes\cuda_setup\libbitsandbytes_cuda116.dll
CUDA SETUP: Loading binary D:\Kohya\kohya_ss\venv\lib\site-packages\bitsandbytes\cuda_setup\libbitsandbytes_cuda116.dll...
use 8-bit AdamW optimizer | {}
running training / 学習開始
num train images repeats / 学習画像の数×繰り返し回数: 2700
num reg images / 正則化画像の数: 0
num batches per epoch / 1epochのバッチ数: 2700
num epochs / epoch数: 1
batch size per device / バッチサイズ: 1
gradient accumulation steps / 勾配を合計するステップ数 = 1
total optimization steps / 学習ステップ数: 2700
steps: 0%| | 0/2700 [00:00<?, ?it/s]
Traceback (most recent call last):
File "D:\Kohya\kohya_ss\train_network.py", line 1033, in <module>
trainer.train(args)
File "D:\Kohya\kohya_ss\train_network.py", line 701, in train
accelerator.init_trackers(
File "D:\Kohya\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 619, in _inner
return PartialState().on_main_process(function)(*args, **kwargs)
File "D:\Kohya\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 2331, in init_trackers
tracker_init(project_name, self.logging_dir, **init_kwargs.get(str(tracker), {}))
File "D:\Kohya\kohya_ss\venv\lib\site-packages\accelerate\tracking.py", line 79, in execute_on_main_process
return PartialState().on_main_process(function)(self, *args, **kwargs)
File "D:\Kohya\kohya_ss\venv\lib\site-packages\accelerate\tracking.py", line 190, in __init__
self.writer = tensorboard.SummaryWriter(self.logging_dir, **kwargs)
File "D:\Kohya\kohya_ss\venv\lib\site-packages\torch\utils\tensorboard\writer.py", line 243, in __init__
self._get_file_writer()
File "D:\Kohya\kohya_ss\venv\lib\site-packages\torch\utils\tensorboard\writer.py", line 273, in _get_file_writer
self.file_writer = FileWriter(
File "D:\Kohya\kohya_ss\venv\lib\site-packages\torch\utils\tensorboard\writer.py", line 72, in __init__
self.event_writer = EventFileWriter(
File "D:\Kohya\kohya_ss\venv\lib\site-packages\tensorboard\summary\writer\event_file_writer.py", line 72, in __init__
tf.io.gfile.makedirs(logdir)
File "D:\Kohya\kohya_ss\venv\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 513, in recursive_create_dir_v2
_pywrap_file_io.RecursivelyCreateDir(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.FailedPreconditionError: D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/log is not a directory
steps: 0%| | 0/2700 [00:00<?, ?it/s]
Traceback (most recent call last):
File "C:\Users\Johny\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Johny\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "D:\Kohya\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "D:\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
args.func(args)
File "D:\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
simple_launcher(args)
File "D:\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\Kohya\kohya_ss\venv\Scripts\python.exe', './train_network.py', '--bucket_no_upscale', '--bucket_reso_steps=64', '--cache_latents', '--caption_extension=.txt', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--learning_rate=0.0001', '--logging_dir=D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/log', '--lr_scheduler=cosine', '--lr_scheduler_num_cycles=1', '--lr_warmup_steps=270', '--max_data_loader_n_workers=0', '--max_grad_norm=1', '--resolution=512,512', '--max_train_steps=2700', '--mixed_precision=fp16', '--network_alpha=1', '--network_dim=8', '--network_module=networks.lora', '--optimizer_type=AdamW8bit', '--output_dir=D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/output', '--output_name=pendulum', '--pretrained_model_name_or_path=D:/AI Interface/webui/models/Stable-diffusion/realisticVisionV60B1_v60B1VAE.safetensors', '--save_every_n_epochs=1', '--save_model_as=safetensors', '--save_precision=float', '--text_encoder_lr=0.0001', '--train_batch_size=1', '--train_data_dir=D:/AI Zeug/Dump für alte Bilder und Lora source/Lora source Spec/image', '--unet_lr=0.0001', '--xformers']' returned non-zero exit status 1.