Closed: bulb1czek closed this issue 6 months ago.
Looks like there is an issue with the installation of torchvision... Did you install all the Windows pre-requirements as instructed, making sure to install CUDA as specified?
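For reference, a quick way to confirm that the venv's torch/torchvision build actually sees CUDA (a minimal sketch, not part of the kohya_ss setup scripts; run it with the venv's own Python, e.g. venv\Scripts\python.exe):

```python
# Minimal sanity check (an assumption on my part, not something kohya_ss ships):
# verify that torch and torchvision import cleanly and that CUDA is visible.
import torch
import torchvision

print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```

If CUDA shows as unavailable here, training will run on the CPU, which would be consistent with the "accelerator device: cpu" line in the log below.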
I installed CUDA (including the Visual Studio 2022 integration), Python, and Git.
"20:52:18-460137 INFO Start training LoRA Standard ... 20:52:18-462142 INFO Validating lr scheduler arguments... 20:52:18-463126 INFO Validating optimizer arguments... 20:52:18-465120 INFO Validating model file or folder path runwayml/stable-diffusion-v1-5 existence... 20:52:18-466117 INFO ...huggingface.co model, skipping validation 20:52:18-467213 INFO Validating output_dir path C:/Users/admin/Documents/ai/Training/Otter/model existence... 20:52:18-469205 INFO ...valid 20:52:18-470174 INFO Validating train_data_dir path C:/Users/admin/Documents/ai/Training/Otter/img existence... 20:52:18-472674 INFO ...valid 20:52:18-474894 INFO reg_data_dir not specified, skipping validation 20:52:18-475886 INFO Validating logging_dir path C:/Users/admin/Documents/ai/Training/Otter/model existence... 20:52:18-476884 INFO ...valid 20:52:18-478951 INFO log_tracker_config not specified, skipping validation 20:52:18-479948 INFO resume not specified, skipping validation 20:52:18-481079 INFO vae not specified, skipping validation 20:52:18-482073 INFO network_weights not specified, skipping validation 20:52:18-483071 INFO dataset_config not specified, skipping validation 20:52:18-486063 INFO Folder 115_OTTAH otter: 115 repeats found 20:52:18-487592 INFO Folder 115_OTTAH otter: 12 images found 20:52:18-488585 INFO Folder 115_OTTAH otter: 12 * 115 = 1380 steps 20:52:18-491414 INFO Regulatization factor: 1 20:52:18-495420 INFO Total steps: 1380 20:52:18-522061 INFO Train batch size: 1 20:52:18-523055 INFO Gradient accumulation steps: 1 20:52:18-530612 INFO Epoch: 1 20:52:18-531606 INFO Max train steps: 1600 20:52:18-532604 INFO stop_text_encoder_training = 0 20:52:18-533709 INFO lr_warmup_steps = 0 20:52:18-535718 INFO Saving training config to C:/Users/admin/Documents/ai/Training/Otter/model\Otter_20240425-205218.json... 20:52:18-541610 INFO Executing command: "C:\Users\admin\Documents\ai\khoya\kohya_ss\venv\Scripts\accelerate.EXE" launch --dynamo_backend no --dynamo_mode default --mixed_precision fp16 --num_processes 1 --num_machines 1 --num_cpu_threads_per_process 2 "C:/Users/admin/Documents/ai/khoya/kohya_ss/sd-scripts/train_network.py" --config_file "./outputs/tmpfilelora.toml" with shell=True 20:52:18-545182 INFO Command executed. 2024-04-25 20:52:30 INFO Loading settings from ./outputs/tmpfilelora.toml... train_util.py:3744 INFO ./outputs/tmpfilelora train_util.py:3763 2024-04-25 20:52:30 INFO prepare tokenizer train_util.py:4227 2024-04-25 20:52:31 INFO update token length: 75 train_util.py:4244 INFO Using DreamBooth method. train_network.py:172 INFO prepare images. train_util.py:1572 INFO found directory C:\Users\admin\Documents\ai\Training\Otter\img\115_OTTAH train_util.py:1519 otter contains 12 image files INFO 1380 train images with repeating. train_util.py:1613 INFO 0 reg images. train_util.py:1616 WARNING no regularization images / 正則化画像が見つかりませんでした train_util.py:1621 INFO [Dataset 0] config_util.py:565 batch_size: 1 resolution: (512, 512) enable_bucket: True network_multiplier: 1.0 min_bucket_reso: 256 max_bucket_reso: 2048 bucket_reso_steps: 64 bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "C:\Users\admin\Documents\ai\Training\Otter\img\115_OTTAH
otter"
image_count: 12
num_repeats: 115
shuffle_caption: False
keep_tokens: 0
keep_tokens_separator:
secondary_separator: None
enable_wildcard: False
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
caption_prefix: None
caption_suffix: None
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: OTTAH otter
caption_extension: .txt
INFO [Dataset 0] config_util.py:571
INFO loading image sizes. train_util.py:853
100%|████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 1394.73it/s]
INFO make buckets train_util.py:859
WARNING min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / train_util.py:876
bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
INFO number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む) train_util.py:905
INFO bucket 0: resolution (192, 256), count: 115 train_util.py:910
INFO bucket 1: resolution (192, 384), count: 115 train_util.py:910
INFO bucket 2: resolution (384, 512), count: 345 train_util.py:910
INFO bucket 3: resolution (448, 512), count: 115 train_util.py:910
INFO bucket 4: resolution (512, 384), count: 115 train_util.py:910
INFO bucket 5: resolution (512, 512), count: 345 train_util.py:910
INFO bucket 6: resolution (640, 384), count: 230 train_util.py:910
INFO mean ar error (without repeats): 0.028953594419661194 train_util.py:915
INFO preparing accelerator train_network.py:225
accelerator device: cpu
INFO loading model for process 0/1 train_util.py:4385
INFO load Diffusers pretrained models: runwayml/stable-diffusion-v1-5 train_util.py:4347
Loading pipeline components...: 100%|████████████████████████████████████████████████████| 5/5 [00:00<00:00, 9.13it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
2024-04-25 20:52:32 INFO UNet2DConditionModel: 64, 8, 768, False, False original_unet.py:1387
2024-04-25 20:53:02 INFO U-Net converted to original U-Net train_util.py:4372
INFO Enable memory efficient attention for U-Net train_util.py:2657
Traceback (most recent call last):
File "C:\Users\admin\Documents\ai\khoya\kohya_ss\sd-scripts\train_network.py", line 1115, in
Can you try training with the base model instead? Not sure what model you are using… but if it is not the base model, try with it…
I'm using "runwayml/stable-diffusion-v1-5"!
Can you share the config file so I can try it on my system? If I can't reproduce the issue then it must be something specific to your system.
Well... it trains just fine on my system. Here is the log from the training:
19:44:45-473702 INFO Kohya_ss GUI version: v24.0.8
19:44:45-959154 INFO Submodule initialized and updated.
19:44:45-962152 INFO nVidia toolkit detected
19:44:48-865444 INFO Torch 2.1.2+cu118
19:44:48-887051 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8905
19:44:48-890059 INFO Torch detected GPU: NVIDIA GeForce RTX 3090 VRAM 24576 Arch (8, 6) Cores 82
19:44:48-894054 INFO Python version is 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
19:44:48-895054 INFO Verifying modules installation status from requirements_pytorch_windows.txt...
19:44:48-902213 INFO Verifying modules installation status from requirements_windows.txt...
19:44:48-907725 INFO Verifying modules installation status from requirements.txt...
19:44:59-575359 INFO headless: False
19:44:59-631209 INFO Using shell=True when running external commands...
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
19:46:07-684896 INFO Loading config...
19:47:43-236310 INFO Start training LoRA Standard ...
19:47:43-238310 INFO Validating lr scheduler arguments...
19:47:43-239313 INFO Validating optimizer arguments...
19:47:43-240311 INFO Validating model file or folder path runwayml/stable-diffusion-v1-5 existence...
19:47:43-241310 INFO ...huggingface.co model, skipping validation
19:47:43-242310 INFO Validating output_dir path D:/kohya_ss/outputs existence...
19:47:43-242310 INFO ...valid
19:47:43-243309 INFO Validating train_data_dir path D:/kohya_ss/test/img existence...
19:47:43-244310 INFO ...valid
19:47:43-245309 INFO reg_data_dir not specified, skipping validation
19:47:43-245309 INFO Validating logging_dir path D:/kohya_ss/outputs/Otter/logs existence...
19:47:43-247311 INFO ...created folder at D:/kohya_ss/outputs/Otter/logs
19:47:43-248542 INFO log_tracker_config not specified, skipping validation
19:47:43-249541 INFO resume not specified, skipping validation
19:47:43-250542 INFO vae not specified, skipping validation
19:47:43-251542 INFO network_weights not specified, skipping validation
19:47:43-252544 INFO dataset_config not specified, skipping validation
19:47:43-253541 INFO Folder 10_darius kawasaki person: 10 repeats found
19:47:43-255541 INFO Folder 10_darius kawasaki person: 8 images found
19:47:43-256542 INFO Folder 10_darius kawasaki person: 8 * 10 = 80 steps
19:47:43-257544 INFO Regulatization factor: 1
19:47:43-258544 INFO Total steps: 80
19:47:43-259543 INFO Train batch size: 1
19:47:43-260541 INFO Gradient accumulation steps: 1
19:47:43-261544 INFO Epoch: 1
19:47:43-262544 INFO Max train steps: 1600
19:47:43-263544 INFO stop_text_encoder_training = 0
19:47:43-264551 INFO lr_warmup_steps = 0
19:47:43-266550 INFO Saving training config to D:/kohya_ss/outputs\Otter_20240426-194743.json...
19:47:43-268549 INFO Executing command: "D:\kohya_ss\venv\Scripts\accelerate.EXE" launch --dynamo_backend no --dynamo_mode default --mixed_precision fp16 --num_processes 1 --num_machines 1
--num_cpu_threads_per_process 2 "D:/kohya_ss/sd-scripts/train_network.py" --config_file "./outputs/tmpfilelora.toml" with shell=True
19:47:43-275550 INFO Command executed.
2024-04-26 19:47:50 WARNING A matching Triton is not available, some optimizations will not be enabled. __init__.py:55
Error caught was: No module named 'triton'
2024-04-26 19:47:52 INFO Loading settings from ./outputs/tmpfilelora.toml... train_util.py:3744
INFO ./outputs/tmpfilelora train_util.py:3763
2024-04-26 19:47:52 INFO prepare tokenizer train_util.py:4227
2024-04-26 19:47:53 INFO update token length: 75 train_util.py:4244
INFO Using DreamBooth method. train_network.py:172
INFO prepare images. train_util.py:1572
INFO found directory D:\kohya_ss\test\img\10_darius kawasaki person contains 8 image files train_util.py:1519
INFO 80 train images with repeating. train_util.py:1613
INFO 0 reg images. train_util.py:1616
WARNING no regularization images / 正則化画像が見つかりませんでした train_util.py:1621
INFO [Dataset 0] config_util.py:565
batch_size: 1
resolution: (512, 512)
enable_bucket: True
network_multiplier: 1.0
min_bucket_reso: 256
max_bucket_reso: 2048
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "D:\kohya_ss\test\img\10_darius kawasaki person"
image_count: 8
num_repeats: 10
shuffle_caption: False
keep_tokens: 0
keep_tokens_separator:
secondary_separator: None
enable_wildcard: False
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
caption_prefix: None
caption_suffix: None
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: darius kawasaki person
caption_extension: .txt
INFO [Dataset 0] config_util.py:571
INFO loading image sizes. train_util.py:853
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 331.32it/s]
INFO make buckets train_util.py:859
WARNING min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / train_util.py:876
bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
INFO number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む) train_util.py:905
INFO bucket 0: resolution (512, 512), count: 80 train_util.py:910
INFO mean ar error (without repeats): 0.0 train_util.py:915
INFO preparing accelerator train_network.py:225
accelerator device: cuda
INFO loading model for process 0/1 train_util.py:4385
INFO load Diffusers pretrained models: runwayml/stable-diffusion-v1-5 train_util.py:4347
Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 10.24it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
2024-04-26 19:47:54 INFO UNet2DConditionModel: 64, 8, 768, False, False original_unet.py:1387
2024-04-26 19:48:01 INFO U-Net converted to original U-Net train_util.py:4372
2024-04-26 19:48:02 INFO Enable memory efficient attention for U-Net train_util.py:2657
import network module: networks.lora
INFO [Dataset 0] train_util.py:2079
INFO caching latents. train_util.py:974
INFO checking cache validity... train_util.py:984
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<?, ?it/s]
INFO caching latents... train_util.py:1021
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:02<00:00, 3.35it/s]
2024-04-26 19:48:06 INFO create LoRA network. base dim (rank): 8, alpha: 1 lora.py:810
INFO neuron dropout: p=None, rank dropout: p=None, module dropout: p=None lora.py:811
INFO create LoRA for Text Encoder: lora.py:905
INFO create LoRA for Text Encoder: 72 modules. lora.py:910
INFO create LoRA for U-Net: 192 modules. lora.py:918
INFO enable LoRA for text encoder lora.py:961
INFO enable LoRA for U-Net lora.py:966
INFO CrossAttnDownBlock2D False -> True original_unet.py:1521
INFO CrossAttnDownBlock2D False -> True original_unet.py:1521
INFO CrossAttnDownBlock2D False -> True original_unet.py:1521
INFO DownBlock2D False -> True original_unet.py:1521
INFO UNetMidBlock2DCrossAttn False -> True original_unet.py:1521
INFO UpBlock2D False -> True original_unet.py:1521
INFO CrossAttnUpBlock2D False -> True original_unet.py:1521
INFO CrossAttnUpBlock2D False -> True original_unet.py:1521
INFO CrossAttnUpBlock2D False -> True original_unet.py:1521
prepare optimizer, data loader etc.
INFO use Adafactor optimizer | {'relative_step': True} train_util.py:4047
INFO relative_step is true / relative_stepがtrueです train_util.py:4050
WARNING learning rate is used as initial_lr / 指定したlearning rateはinitial_lrとして使用されます train_util.py:4052
WARNING unet_lr and text_encoder_lr are ignored / unet_lrとtext_encoder_lrは無視されます train_util.py:4064
INFO use adafactor_scheduler / スケジューラにadafactor_schedulerを使用します train_util.py:4069
running training / 学習開始
num train images * repeats / 学習画像の数×繰り返し回数: 80
num reg images / 正則化画像の数: 0
num batches per epoch / 1epochのバッチ数: 80
num epochs / epoch数: 20
batch size per device / バッチサイズ: 1
gradient accumulation steps / 勾配を合計するステップ数 = 1
total optimization steps / 学習ステップ数: 1600
steps: 0%| | 0/1600 [00:00<?, ?it/s]
epoch 1/20
2024-04-26 19:48:09 WARNING A matching Triton is not available, some optimizations will not be enabled. __init__.py:55
Error caught was: No module named 'triton'
D:\kohya_ss\venv\lib\site-packages\torch\utils\checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
steps: 2%|██▌ | 26/1600 [00:36<36:49, 1.40s/it, avr_loss=0.198]19:48:44-084123 INFO The running process has been terminated.
19:48:44-978963 INFO Training has ended.
Here is a copy of the JSON config. It uses the test images and folders in the kohya_ss folder itself... so it should run as-is on your computer. If it doesn't work, then the problem is with the software / drivers installed on your machine.
Otter_20240427-121803.json is the JSON I get, by the way (which I'm definitely sure is the config?). LoraLowVRAMSettings-test.json
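A quick way to compare the two JSON configs mentioned in this thread and spot any setting that differs (a minimal sketch using only the Python standard library; the file names are just the ones attached above):

```python
# Sketch: diff two kohya_ss GUI JSON configs to see which settings differ
# between the working run and the failing one. The file names are the ones
# attached in this thread; adjust the paths as needed.
import json

def load(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

working = load("LoraLowVRAMSettings-test.json")
failing = load("Otter_20240427-121803.json")

for key in sorted(set(working) | set(failing)):
    if working.get(key) != failing.get(key):
        print(f"{key}: {working.get(key)!r} (working) vs {failing.get(key)!r} (failing)")
```

Any key that prints here is a candidate for the setting that breaks the run.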
Seems to be an issue on my end; I appreciate you doing your best. Have a nice day!
I tried Adafactor and AdamW.
Any help is appreciated!