bmaltais / kohya_ss

Apache License 2.0

An error was reported #1763

Closed zhaowolf closed 8 months ago

zhaowolf commented 10 months ago

18:49:52-833183 INFO Start training LoRA Standard ...
18:49:52-833183 INFO Checking for duplicate image filenames in training data directory...
18:49:52-835184 INFO Valid image folder names found in: F:/kohya_ss/test/MiniGPT-4
18:49:54-613100 INFO Folder 50_MIniGPT-4: 15 images found
18:49:54-614105 INFO Folder 50_MIniGPT-4: 750 steps
18:49:54-615511 INFO Total steps: 750
18:49:54-616515 INFO Train batch size: 1
18:49:54-617572 INFO Gradient accumulation steps: 1
18:49:54-617572 INFO Epoch: 10
18:49:54-619386 INFO Regulatization factor: 1
18:49:54-620394 INFO max_train_steps (750 / 1 / 1 * 10 * 1) = 7500
18:49:54-621899 INFO stop_text_encoder_training = 0
18:49:54-622908 INFO lr_warmup_steps = 750
18:49:54-624348 INFO Saving training config to F:/kohya_ss/test/output\MiniGPT-4_20231212-184954.json...
18:49:54-626356 INFO accelerate launch --num_cpu_threads_per_process=2 "./train_network.py" --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --pretrained_model_name_or_path="F:/Stable Diffusion/sd-webui-aki-v4.4/models/Stable-diffusion/NB类/chilloutmix_NiPrunedFp32Fix.safetensors" --train_data_dir="F:/kohya_ss/test/MiniGPT-4" --resolution="704,1024" --output_dir="F:/kohya_ss/test/output" --logging_dir="F:/kohya_ss/test/log" --network_alpha="64" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=8e-06 --unet_lr=7e-05 --network_dim=128 --output_name="MiniGPT-4" --lr_scheduler_num_cycles="10" --no_half_vae --learning_rate="1e-05" --lr_scheduler="cosine_with_restarts" --lr_warmup_steps="750" --train_batch_size="1" --max_train_steps="7500" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --cache_latents --optimizer_type="AdamW8bit" --max_data_loader_n_workers="0" --bucket_reso_steps=64 --xformers --bucket_no_upscale --noise_offset=0.0
prepare tokenizer
Using DreamBooth method.
prepare images.
found directory F:\kohya_ss\test\MiniGPT-4\50_MIniGPT-4 contains 15 image files
No caption file found for 15 images. Training will continue without captions for these images. If class token exists, it will be used. / 15枚の画像にキャプションファイルが見つかりませんでした。これらの画像についてはキャプションなしで学習を続行します。class tokenが存在する場合はそれを使います。
F:\kohya_ss\test\MiniGPT-4\50_MIniGPT-4\001 (1).png
F:\kohya_ss\test\MiniGPT-4\50_MIniGPT-4\001 (10).png
F:\kohya_ss\test\MiniGPT-4\50_MIniGPT-4\001 (11).png
F:\kohya_ss\test\MiniGPT-4\50_MIniGPT-4\001 (12).png
F:\kohya_ss\test\MiniGPT-4\50_MIniGPT-4\001 (13).png
F:\kohya_ss\test\MiniGPT-4\50_MIniGPT-4\001 (14).png... and 10 more
750 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
  batch_size: 1
  resolution: (704, 1024)
  enable_bucket: True
  min_bucket_reso: 256
  max_bucket_reso: 2048
  bucket_reso_steps: 64
  bucket_no_upscale: True

  [Subset 0 of Dataset 0]
    image_dir: "F:\kohya_ss\test\MiniGPT-4\50_MIniGPT-4"
    image_count: 15
    num_repeats: 50
    shuffle_caption: False
    keep_tokens: 0
    caption_dropout_rate: 0.0
    caption_dropout_every_n_epoches: 0
    caption_tag_dropout_rate: 0.0
    caption_prefix: None
    caption_suffix: None
    color_aug: False
    flip_aug: False
    face_crop_aug_range: None
    random_crop: False
    token_warmup_min: 1,
    token_warmup_step: 0,
    is_reg: False
    class_tokens: MIniGPT-4
    caption_extension: .caption

[Dataset 0]
loading image sizes.
100%|████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 2994.08it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (704, 896), count: 450
bucket 1: resolution (1024, 704), count: 300
mean ar error (without repeats): 0.008304761904761941
preparing accelerator
loading model for process 0/1
load StableDiffusion checkpoint: F:/Stable Diffusion/sd-webui-aki-v4.4/models/Stable-diffusion/NB类/chilloutmix_NiPrunedFp32Fix.safetensors
UNet2DConditionModel: 64, 8, 768, False, False
loading u-net:
loading vae:
loading text encoder:
Enable xformers for U-Net
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
import network module: networks.lora
[Dataset 0]
caching latents.
checking cache validity...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<?, ?it/s]
caching latents...
100%|██████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:09<00:00, 1.54it/s]
create LoRA network. base dim (rank): 128, alpha: 64.0
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder:
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
C:\Program Files\Python310\lib\site-packages\bitsandbytes\cuda_setup\paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('/usr/local/cuda/lib64')}
  warn(
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
CUDA SETUP: Loading binary C:\Program Files\Python310\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
Traceback (most recent call last):
  File "F:\kohya_ss\train_network.py", line 1012, in <module>
    trainer.train(args)
  File "F:\kohya_ss\train_network.py", line 342, in train
    optimizer_name, optimizer_args, optimizer = train_util.get_optimizer(args, trainable_params)
  File "F:\kohya_ss\library\train_util.py", line 3444, in get_optimizer
    import bitsandbytes as bnb
  File "C:\Program Files\Python310\lib\site-packages\bitsandbytes\__init__.py", line 6, in <module>
    from .autograd._functions import (
  File "C:\Program Files\Python310\lib\site-packages\bitsandbytes\autograd\_functions.py", line 5, in <module>
    import bitsandbytes.functional as F
  File "C:\Program Files\Python310\lib\site-packages\bitsandbytes\functional.py", line 13, in <module>
    from .cextension import COMPILED_WITH_CUDA, lib
  File "C:\Program Files\Python310\lib\site-packages\bitsandbytes\cextension.py", line 41, in <module>
    lib = CUDALibrary_Singleton.get_instance().lib
  File "C:\Program Files\Python310\lib\site-packages\bitsandbytes\cextension.py", line 37, in get_instance
    cls._instance.initialize()
  File "C:\Program Files\Python310\lib\site-packages\bitsandbytes\cextension.py", line 31, in initialize
    self.lib = ct.cdll.LoadLibrary(binary_path)
  File "C:\Program Files\Python310\lib\ctypes\__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "C:\Program Files\Python310\lib\ctypes\__init__.py", line 364, in __init__
    if '/' in name or '\\' in name:
TypeError: argument of type 'WindowsPath' is not iterable
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Program Files\Python310\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "C:\Program Files\Python310\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
    args.func(args)
  File "C:\Program Files\Python310\lib\site-packages\accelerate\commands\launch.py", line 986, in launch_command
    simple_launcher(args)
  File "C:\Program Files\Python310\lib\site-packages\accelerate\commands\launch.py", line 628, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Program Files\Python310\python.exe', './train_network.py', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--pretrained_model_name_or_path=F:/Stable Diffusion/sd-webui-aki-v4.4/models/Stable-diffusion/NB类/chilloutmix_NiPrunedFp32Fix.safetensors', '--train_data_dir=F:/kohya_ss/test/MiniGPT-4', '--resolution=704,1024', '--output_dir=F:/kohya_ss/test/output', '--logging_dir=F:/kohya_ss/test/log', '--network_alpha=64', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=8e-06', '--unet_lr=7e-05', '--network_dim=128', '--output_name=MiniGPT-4', '--lr_scheduler_num_cycles=10', '--no_half_vae', '--learning_rate=1e-05', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=750', '--train_batch_size=1', '--max_train_steps=7500', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale', '--noise_offset=0.0']' returned non-zero exit status 1.
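
Note on the final TypeError: the traceback shows ctypes running a string membership test ('/' in name) on the library path, while the value passed to LoadLibrary was a WindowsPath object, which does not support the "in" operator. The sketch below reproduces that failure in isolation. The helper name contains_separator is hypothetical, and converting the path with str() is only an illustration of why the check fails; it is not necessarily how bitsandbytes or kohya_ss addressed the problem.

```python
from pathlib import PureWindowsPath


def contains_separator(name):
    """Mimic the membership check shown in the ctypes frame of the traceback."""
    return '/' in name or '\\' in name


# PureWindowsPath behaves like WindowsPath for this check and works on any OS.
binary_path = PureWindowsPath(
    r"C:\Program Files\Python310\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so"
)

try:
    contains_separator(binary_path)
    failed = False
except TypeError as exc:
    # A path object is neither a string nor iterable, so "in" raises TypeError.
    failed = True
    print("TypeError:", exc)

# With a plain string the same check succeeds.
ok = contains_separator(str(binary_path))
print("check on str(path):", ok)
```

The log also shows bitsandbytes falling back to libbitsandbytes_cpu.so after failing to find libcudart.so, so the path type is only the immediate trigger; the missing CUDA setup is what led the import down this code path.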

tjip1234 commented 10 months ago

Be sure to install CUDA the way it is described in the guide. If that doesn't work, what is the output of nvcc --version?
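
A minimal sketch for collecting that output from Python, assuming only the standard library. The function name nvcc_version is hypothetical; it returns None when nvcc is not on PATH, which usually means the CUDA toolkit is missing or its bin directory has not been added to PATH.

```python
import shutil
import subprocess


def nvcc_version():
    """Return the text printed by `nvcc --version`, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    # Fall back to stderr in case the tool reports there.
    return result.stdout.strip() or result.stderr.strip()


output = nvcc_version()
if output is None:
    print("nvcc not found on PATH; the CUDA toolkit may not be installed")
else:
    print(output)
```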

zhaowolf commented 10 months ago

Thank you very much for your answer.