bmaltais / kohya_ss

Apache License 2.0
Lora training error on Runpod (v24.0.3) #2324

Closed orangemagic123 closed 6 months ago

orangemagic123 commented 6 months ago

07:06:16-302330 INFO Start training LoRA LyCORIS/LoCon ...
07:06:16-306596 INFO Validating model file or folder path
/workspace/stable-diffusion-webui-forge/models/Stable-diffusion/animagine-xl-3.1.safetensors existence...
07:06:16-313774 INFO ...valid
07:06:16-317207 INFO Validating output_dir path
/workspace/stable-diffusion-webui-forge/models/Lora/lawrence83 existence...
07:06:16-322670 INFO ...valid
07:06:16-326082 INFO Validating train_data_dir path
/workspace/kohya_ss/trainimg existence...
07:06:16-331125 INFO ...valid
07:06:16-334145 INFO reg_data_dir not specified, skipping validation
07:06:16-338203 INFO Validating logging_dir path
/workspace/stable-diffusion-webui-forge/models/Lora/lawrence83 existence...
07:06:16-342492 INFO ...valid
07:06:16-345762 INFO log_tracker_config not specified, skipping validation
07:06:16-349161 INFO resume not specified, skipping validation
07:06:16-352572 INFO Validating vae path
/workspace/stable-diffusion-webui-forge/models/VAE/sdxl_vae.safetensors existence...
07:06:16-357764 INFO ...valid
07:06:16-361144 INFO lora_network_weights not specified, skipping validation
07:06:16-364665 INFO dataset_config not specified, skipping validation
07:06:16-368293 INFO Headless mode, skipping verification if model already exist... if model already exist it will be overwritten...
07:06:16-374849 INFO Folder 14_style: 14 repeats found
07:06:16-380545 INFO Folder 14_style: 13 images found
07:06:16-384107 INFO Folder 14_style: 13 * 14 = 182 steps
07:06:16-387875 INFO Folder 8_1girl: 8 repeats found
07:06:16-393226 INFO Folder 8_1girl: 22 images found
07:06:16-397797 INFO Folder 8_1girl: 22 * 8 = 176 steps
07:06:16-405109 INFO Error: '.ipynb_checkpoints' does not contain an underscore, skipping...
07:06:16-409680 INFO Folder 5_1boy: 5 repeats found
07:06:16-414678 INFO Folder 5_1boy: 38 images found
07:06:16-418377 INFO Folder 5_1boy: 38 * 5 = 190 steps
07:06:16-422166 INFO Regulatization factor: 1
07:06:16-425770 INFO Total steps: 548
07:06:16-429198 INFO Train batch size: 1
07:06:16-433109 INFO Gradient accumulation steps: 1
07:06:16-436683 INFO Epoch: 15
07:06:16-440203 INFO max_train_steps (548 / 1 / 1 * 15 * 1) = 8220
07:06:16-444776 INFO stop_text_encoder_training = 0
07:06:16-448317 INFO lr_warmup_steps = 0
07:06:16-466533 INFO Saving training config to
/workspace/stable-diffusion-webui-forge/models/Lora/lawrence83/lawrence83_20240418-070616.json...
07:06:16-476560 INFO Executing command:
"/workspace/kohya_ss/venv/bin/accelerate" launch
--dynamo_backend no --dynamo_mode default
--mixed_precision bf16 --num_processes 1 --num_machines 1 --num_cpu_threads_per_process 2
"/workspace/kohya_ss/sd-scripts/sdxl_train_network.py" --config_file "./outputs/tmpfilelora.toml" with
shell=True
07:06:16-485254 INFO Command executed.
2024-04-18 07:07:07.229839: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-18 07:07:07.229899: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-18 07:07:07.234279: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-04-18 07:07:07.630309: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-18 07:07:15.064421: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-18 07:07:26 INFO Loading settings from ./outputs/tmpfilelora.toml...  train_util.py:3744
INFO ./outputs/tmpfilelora  train_util.py:3763
2024-04-18 07:07:26 INFO prepare tokenizers  sdxl_train_util.py:134
2024-04-18 07:07:27 INFO update token length: 75  sdxl_train_util.py:159
INFO Using DreamBooth method.  train_network.py:172
WARNING ignore directory without repeats / 繰り返し回数のないディレクトリを無視します: .ipynb_checkpoints  config_util.py:584
2024-04-18 07:07:28 INFO prepare images.  train_util.py:1572
INFO found directory /workspace/kohya_ss/trainimg/14_style contains 13 image files  train_util.py:1519
INFO found directory /workspace/kohya_ss/trainimg/8_1girl contains 22 image files  train_util.py:1519
INFO found directory /workspace/kohya_ss/trainimg/5_1boy contains 38 image files  train_util.py:1519
INFO 548 train images with repeating.  train_util.py:1613
INFO 0 reg images.  train_util.py:1616
WARNING no regularization images / 正則化画像が見つかりませんでした  train_util.py:1621
INFO [Dataset 0]  config_util.py:565
batch_size: 1
resolution: (1024, 1024)
enable_bucket: True
network_multiplier: 1.0
min_bucket_reso: 256
max_bucket_reso: 2048
bucket_reso_steps: 64
bucket_no_upscale: True

                           [Subset 0 of Dataset 0]                          
                             image_dir: "/workspace/kohya_ss/trainimg/14_style"
                             image_count: 13                                
                             num_repeats: 14                                
                             shuffle_caption: True                          
                             keep_tokens: 1                                 
                             keep_tokens_separator:                         
                             secondary_separator: None                      
                             enable_wildcard: False                         
                             caption_dropout_rate: 0                        
                             caption_dropout_every_n_epoches: 0
                             caption_tag_dropout_rate: 0.0
                             caption_prefix: None                           
                             caption_suffix: None                           
                             color_aug: False                               
                             flip_aug: False                                
                             face_crop_aug_range: None                      
                             random_crop: False                             
                             token_warmup_min: 1,                           
                             token_warmup_step: 0,                          
                             is_reg: False                                  
                             class_tokens: style                            
                             caption_extension: .txt                        

                           [Subset 1 of Dataset 0]                          
                             image_dir: "/workspace/kohya_ss/trainimg/8_1girl"
                             image_count: 22                                
                             num_repeats: 8                                 
                             shuffle_caption: True                          
                             keep_tokens: 1                                 
                             keep_tokens_separator:                         
                             secondary_separator: None                      
                             enable_wildcard: False                         
                             caption_dropout_rate: 0                        
                             caption_dropout_every_n_epoches: 0
                             caption_tag_dropout_rate: 0.0
                             caption_prefix: None                           
                             caption_suffix: None                           
                             color_aug: False                               
                             flip_aug: False                                
                             face_crop_aug_range: None                      
                             random_crop: False                             
                             token_warmup_min: 1,                           
                             token_warmup_step: 0,                          
                             is_reg: False                                  
                             class_tokens: 1girl                            
                             caption_extension: .txt                        

                           [Subset 2 of Dataset 0]                          
                             image_dir: "/workspace/kohya_ss/trainimg/5_1boy"
                             image_count: 38                                
                             num_repeats: 5                                 
                             shuffle_caption: True                          
                             keep_tokens: 1                                 
                             keep_tokens_separator:                         
                             secondary_separator: None                      
                             enable_wildcard: False                         
                             caption_dropout_rate: 0                        
                             caption_dropout_every_n_epoches: 0
                             caption_tag_dropout_rate: 0.0
                             caption_prefix: None                           
                             caption_suffix: None                           
                             color_aug: False                               
                             flip_aug: False                                
                             face_crop_aug_range: None                      
                             random_crop: False                             
                             token_warmup_min: 1,                           
                             token_warmup_step: 0,                          
                             is_reg: False                                  
                             class_tokens: 1boy                             
                             caption_extension: .txt                        

                INFO     [Dataset 0]                      config_util.py:571
                INFO     loading image sizes.              train_util.py:853

100%|██████████████████████████████████████████| 73/73 [00:00<00:00, 258.04it/s]
2024-04-18 07:07:29 INFO make buckets  train_util.py:859
WARNING min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます  train_util.py:876

INFO number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)  train_util.py:905
INFO bucket 0: resolution (576, 1536), count: 5  train_util.py:910
INFO bucket 1: resolution (640, 1408), count: 5  train_util.py:910
INFO bucket 2: resolution (640, 1536), count: 5  train_util.py:910
INFO bucket 3: resolution (704, 1344), count: 13  train_util.py:910
INFO bucket 4: resolution (768, 1152), count: 5  train_util.py:910
INFO bucket 5: resolution (768, 1216), count: 5  train_util.py:910
INFO bucket 6: resolution (768, 1344), count: 5  train_util.py:910
INFO bucket 7: resolution (832, 1152), count: 267  train_util.py:910
INFO bucket 8: resolution (832, 1216), count: 125  train_util.py:910
INFO bucket 9: resolution (896, 1024), count: 8  train_util.py:910
INFO bucket 10: resolution (1024, 1024), count: 77  train_util.py:910
INFO bucket 11: resolution (1152, 832), count: 14  train_util.py:910
INFO bucket 12: resolution (1728, 576), count: 14  train_util.py:910
INFO mean ar error (without repeats): 0.006797034517854034  train_util.py:915
WARNING clip_skip will be unexpected / SDXL学習ではclip_skipは動作しません  sdxl_train_util.py:343
INFO preparing accelerator  train_network.py:225
accelerator device: cuda
INFO loading model for process 0/1  sdxl_train_util.py:30
INFO load StableDiffusion checkpoint: /workspace/stable-diffusion-webui-forge/models/Stable-diffusion/animagine-xl-3.1.safetensors  sdxl_train_util.py:70
2024-04-18 07:07:31 INFO building U-Net  sdxl_model_util.py:192
2024-04-18 07:07:32 INFO loading U-Net from checkpoint  sdxl_model_util.py:196
2024-04-18 07:07:49 INFO U-Net: <All keys matched successfully>  sdxl_model_util.py:202
INFO building text encoders  sdxl_model_util.py:205
2024-04-18 07:07:50 INFO loading text encoders from checkpoint  sdxl_model_util.py:258
INFO text encoder 1: <All keys matched successfully>  sdxl_model_util.py:272
2024-04-18 07:07:55 INFO text encoder 2: <All keys matched successfully>  sdxl_model_util.py:276
INFO building VAE  sdxl_model_util.py:279
INFO loading VAE from checkpoint  sdxl_model_util.py:284
2024-04-18 07:07:56 INFO VAE: <All keys matched successfully>  sdxl_model_util.py:287
INFO load VAE: /workspace/stable-diffusion-webui-forge/models/VAE/sdxl_vae.safetensors  model_util.py:1268
2024-04-18 07:07:59 INFO additional VAE loaded  sdxl_train_util.py:128
INFO Enable xformers for U-Net  train_util.py:2660
import network module: lycoris.kohya
False

===================================BUG REPORT===================================
/workspace/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:183: UserWarning: Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

  warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib'), PosixPath('/usr/local/nvidia/lib64'), PosixPath('/workspace/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64')}
/workspace/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:183: UserWarning: /workspace/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0', 'libcudart.so.12.1', 'libcudart.so.12.2'] as expected! Searching further paths...
  warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('module'), PosixPath('//matplotlib_inline.backend_inline')}
The following directories listed in your path were found to be non-existent: {PosixPath('mykey myaccount')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=118, Highest Compute Capability: 8.6.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary /workspace/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
libcusparse.so.11: cannot open shared object file: No such file or directory
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=118 make cuda11x
python setup.py install
INFO [Dataset 0]  train_util.py:2079
INFO caching latents.  train_util.py:974
INFO checking cache validity...  train_util.py:984
100%|██████████████████████████████████████████| 73/73 [00:00<00:00, 106.92it/s]
2024-04-18 07:08:00 INFO caching latents...  train_util.py:1021
0%| | 0/24 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/PIL/ImageFile.py", line 271, in load
    s = read(self.decodermaxblock)
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/PIL/PngImagePlugin.py", line 932, in load_read
    cid, pos, length = self.png.read()
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/PIL/PngImagePlugin.py", line 167, in read
    length = i32(s)
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/PIL/_binary.py", line 95, in i32be
    return unpack_from(">I", c, o)[0]
struct.error: unpack_from requires a buffer of at least 4 bytes for unpacking 4 bytes at offset 0 (actual buffer size is 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspace/kohya_ss/sd-scripts/sdxl_train_network.py", line 185, in <module>
    trainer.train(args)
  File "/workspace/kohya_ss/sd-scripts/train_network.py", line 272, in train
    train_dataset_group.cache_latents(vae, args.vae_batch_size, args.cache_latents_to_disk, accelerator.is_main_process)
  File "/workspace/kohya_ss/sd-scripts/library/train_util.py", line 2080, in cache_latents
    dataset.cache_latents(vae, vae_batch_size, cache_to_disk, is_main_process)
  File "/workspace/kohya_ss/sd-scripts/library/train_util.py", line 1023, in cache_latents
    cache_batch_latents(vae, cache_to_disk, batch, subset.flip_aug, subset.random_crop)
  File "/workspace/kohya_ss/sd-scripts/library/train_util.py", line 2403, in cache_batch_latents
    image = load_image(info.absolute_path) if info.image is None else np.array(info.image, np.uint8)
  File "/workspace/kohya_ss/sd-scripts/library/train_util.py", line 2352, in load_image
    img = np.array(image, np.uint8)
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/PIL/Image.py", line 696, in __array_interface__
    new["data"] = self.tobytes()
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/PIL/Image.py", line 755, in tobytes
    self.load()
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/PIL/ImageFile.py", line 278, in load
    raise OSError(msg) from e
OSError: image file is truncated
Traceback (most recent call last):
  File "/workspace/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/workspace/kohya_ss/venv/bin/python', '/workspace/kohya_ss/sd-scripts/sdxl_train_network.py', '--config_file', './outputs/tmpfilelora.toml']' returned non-zero exit status 1.
07:08:05-395512 INFO Training has ended.
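
For anyone reading the step counts near the top of this log: they seem to come from folder names of the form `<repeats>_<name>`, with folders that lack the underscore (here `.ipynb_checkpoints`) skipped. The sketch below only reproduces the arithmetic the log prints. The function names are made up for illustration and this is not the GUI's actual code; the image counts are copied from the log.

```python
import re

# Folder names and image counts copied from the log output.
FOLDERS = {
    "14_style": 13,
    "8_1girl": 22,
    ".ipynb_checkpoints": 0,  # no "<repeats>_" prefix, so it is skipped
    "5_1boy": 38,
}


def parse_repeats(folder_name):
    """Return the repeat count of a '<repeats>_<name>' folder, or None."""
    match = re.fullmatch(r"(\d+)_(.+)", folder_name)
    return int(match.group(1)) if match else None


def total_steps(folders):
    """Sum images * repeats over every folder that has a repeat prefix."""
    steps = 0
    for name, image_count in folders.items():
        repeats = parse_repeats(name)
        if repeats is None:
            continue
        steps += image_count * repeats
    return steps


def max_train_steps(steps, batch_size, grad_accum, epochs, reg_factor):
    """Mirror the expression shown in the log: steps / batch / accum * epochs * factor."""
    return int(steps / batch_size / grad_accum * epochs * reg_factor)


steps = total_steps(FOLDERS)
print(steps)                               # 548
print(max_train_steps(steps, 1, 1, 15, 1)) # 8220
```

With the values from this run (batch size 1, gradient accumulation 1, 15 epochs, regularization factor 1) the result matches the 548 and 8220 reported in the log.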

orangemagic123 commented 6 months ago

solved

zhinangubei commented 5 months ago

How did you solve it? I'm running into the same problem.

orangemagic123 commented 5 months ago

How did you solve it? I'm running into the same problem.

In my case, some images had not been uploaded completely. Open the image files one by one in Jupyter Notebook; some of them may be cut off. Delete those images and upload them again.
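
If opening every file by hand is tedious, a script can flag incomplete uploads. The sketch below is an assumption-laden stand-in and not part of kohya_ss: it uses only the standard library, handles PNG files only (the traceback above comes from `PngImagePlugin`), and the names `png_is_complete` and `find_broken_pngs` are hypothetical. It walks the PNG chunk list and reports any file that ends before a valid `IEND` chunk.

```python
import struct
import zlib
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_is_complete(path):
    """Return True if the PNG has intact chunks up to and including IEND."""
    data = Path(path).read_bytes()
    if not data.startswith(PNG_SIGNATURE):
        return False
    pos = len(PNG_SIGNATURE)
    while True:
        # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC.
        header = data[pos:pos + 8]
        if len(header) < 8:
            return False  # file ends before the next chunk header
        length, chunk_type = struct.unpack(">I4s", header)
        body_end = pos + 8 + length
        crc_bytes = data[body_end:body_end + 4]
        if len(crc_bytes) < 4:
            return False  # chunk data or CRC is cut off
        (crc,) = struct.unpack(">I", crc_bytes)
        # The CRC covers the chunk type and the chunk data.
        if zlib.crc32(data[pos + 4:body_end]) != crc:
            return False
        if chunk_type == b"IEND":
            return True
        pos = body_end + 4


def find_broken_pngs(root):
    """Return the sorted paths of all *.png files under root that look truncated."""
    return sorted(p for p in Path(root).rglob("*.png") if not png_is_complete(p))
```

Calling `find_broken_pngs("/workspace/kohya_ss/trainimg")` would list the files to delete and upload again. For JPEG or other formats, forcing a full decode with Pillow (`Image.open(path).load()`) should raise the same `OSError: image file is truncated` that appears in the traceback.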