kohya-ss / sd-scripts

Apache License 2.0

Unable to Detect Images in Kohya_SS Fine-Tuning on RunPod - Need Assistance #1736

Closed Artemis1111 closed 4 weeks ago

Artemis1111 commented 1 month ago
09:08:32-695749 INFO     Start Finetuning...                                    
09:08:32-697827 INFO     Validating lr scheduler arguments...                   
09:08:32-699710 INFO     Validating optimizer arguments...                      
09:08:32-703569 INFO     Validating /workspace/finetune/image existence...      
                         SUCCESS                                                
09:08:32-705821 INFO     Validating /workspace/finetune/log existence and       
                         writability... SUCCESS                                 
09:08:32-708157 INFO     Validating /workspace/finetune/model existence and     
                         writability... SUCCESS                                 
09:08:32-710750 INFO     Validating /workspace/model/flux1-dev.safetensors      
                         existence... SUCCESS                                   
09:08:32-712824 INFO     /workspace/kohya_ss/venv/bin/python                    
                         /workspace/kohya_ss/sd-scripts/finetune/merge_captions_
                         to_metadata.py --caption_extension .txt                
                         /workspace/finetune/image meta_cap.json --full_path    
/workspace/kohya_ss/venv/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
2024-10-29 09:08:43.842657: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-29 09:08:43.873863: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-29 09:08:43.873911: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-29 09:08:43.873955: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-29 09:08:43.880601: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-29 09:08:46.620549: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/workspace/kohya_ss/venv/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
2024-10-29 09:08:55 INFO     found 0 images.                                      merge_captions_to_metadata.py:23
                    INFO     loading existing metadata: meta_cap.json             merge_captions_to_metadata.py:29
                    WARNING  captions for existing images will be overwritten /   merge_captions_to_metadata.py:31
                             既存の画像のキャプションは上書きされます
                    INFO     merge caption texts to metadata json.                merge_captions_to_metadata.py:36
0it [00:00, ?it/s]
                    INFO     writing metadata: meta_cap.json                      merge_captions_to_metadata.py:53
                    INFO     done!                                                merge_captions_to_metadata.py:55
09:08:56-474847 INFO     /workspace/kohya_ss/venv/bin/python                    
                         /workspace/kohya_ss/sd-scripts/finetune/prepare_buckets
                         _latents.py /workspace/finetune/image meta_cap.json    
                         meta_lat.json /workspace/model/flux1-dev.safetensors   
                         --batch_size 1 --max_resolution 1500,1000              
                         --min_bucket_reso 512 --max_bucket_reso 2048           
                         --mixed_precision fp16 --full_path                     
/workspace/kohya_ss/venv/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
/workspace/kohya_ss/venv/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
2024-10-29 09:09:10.761764: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-29 09:09:10.791000: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-29 09:09:10.791044: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-29 09:09:10.791081: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-29 09:09:10.797377: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-29 09:09:14.533392: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/workspace/kohya_ss/venv/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
  torch.utils._pytree._register_pytree_node(
get_preferred_device() -> cuda
2024-10-29 09:09:20 INFO     found 0 images.                                      prepare_buckets_latents.py:73
                    INFO     loading existing metadata: meta_cap.json             prepare_buckets_latents.py:76
                    INFO     load VAE: /workspace/model/flux1-dev.safetensors     model_util.py:1268
Traceback (most recent call last):
  File "/workspace/kohya_ss/sd-scripts/finetune/prepare_buckets_latents.py", line 286, in <module>
    main(args)
  File "/workspace/kohya_ss/sd-scripts/finetune/prepare_buckets_latents.py", line 89, in main
    vae = model_util.load_vae(args.model_name_or_path, weight_dtype)
  File "/workspace/kohya_ss/sd-scripts/library/model_util.py", line 1304, in load_vae
    converted_vae_checkpoint = convert_ldm_vae_checkpoint(vae_sd, vae_config)
  File "/workspace/kohya_ss/sd-scripts/library/model_util.py", line 415, in convert_ldm_vae_checkpoint
    new_checkpoint["encoder.conv_in.weight"] = vae_state_dict["encoder.conv_in.weight"]
KeyError: 'encoder.conv_in.weight'
09:09:20-950813 INFO     image_num = 0                                          
09:09:20-952208 INFO     repeats = 0                                            
09:09:20-953028 INFO     Max train steps: 0. sd-scripts will therefore default  
                         to 1600. Please specify a different value if required. 
Traceback (most recent call last):
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/gradio/queueing.py", line 622, in process_events
    response = await route_utils.call_process_api(
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/gradio/blocks.py", line 2016, in process_api
    result = await self.call_function(
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1569, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
    response = f(*args, **kwargs)
  File "/workspace/kohya_ss/kohya_gui/finetune_gui.py", line 874, in train_model
    lr_warmup_steps = lr_warmup / 100
TypeError: unsupported operand type(s) for /: 'str' and 'int'

I am unable to solve this issue. The log reports: 2024-10-29 09:08:55 INFO found 0 images.

Inside the folder /workspace/finetune/image/1_example/ there are various images along with their captions, so I set /workspace/finetune/image as the image path. However, no images are being detected. At first I thought the issue might be the large number of images, so I reduced the set to only 4 images with their respective captions, but still 0 images are recognized.

I have included only four images with a resolution of 1500x1000, and I specified this resolution in kohya_ss as well.

Currently, I am attempting fine-tuning in kohya_ss (not LoRA or Dreambooth) and have encountered the issue described above. How can I get the images to be properly recognized?

P.S. I am using RunPod. Could this be related to the issue?

P.S. 2: I have a second question. How many images can kohya_ss handle? If more than 10,000 images are needed, would it still work properly with Kohya?

(image attached)

kohya-ss commented 1 month ago

Unfortunately, FLUX.1 fine-tuning does not yet support training with a metadata .json. Please wait a little longer.

RunPod is probably not the culprit. Animagine XL 3.0 was trained on 1.2 million images using sd-scripts, so it can handle at least that many images.

Artemis1111 commented 1 month ago

Thank you for the quick response. I know you're working hard, but I have just a few more questions to ask.

  1. Does this mean that caption-based training is currently not possible? Is caption-based training unavailable not only in the fine-tuning tab but also in LoRA and the other sections?

  2. What should I change in my setup to enable the training process?

kohya-ss commented 1 month ago
  1. Does this mean that caption-based training is currently not possible? Is caption-based training unavailable not only in the fine-tuning tab but also in LoRA and the other sections?

You can train it by writing captions in a text file with the same filename as the image but a different extension (*.caption by default).
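For illustration only (the filenames here are hypothetical), a folder set up for this kind of caption-file training could look roughly like this, with one caption file sharing each image's basename:

/workspace/finetune/image/1_example/
├── photo_001.jpg
├── photo_001.caption
├── photo_002.jpg
└── photo_002.caption

Each caption file just contains free-form text describing its image. The logs above pass --caption_extension .txt, so .txt sidecar files should also work as long as the same extension is configured for training.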

2. What should I change in my setup to enable the training process?

I'm not familiar with the GUI, but I think you can refer to the following page to write the dataset settings in a .toml file and then specify it from the GUI.

https://github.com/kohya-ss/sd-scripts/blob/main/docs/config_README-en.md
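As a sketch only (the values below mirror this thread and are not verified settings; see the linked README for the authoritative key names), a minimal Dreambooth-style dataset config in .toml might look like:

[general]
caption_extension = ".txt"   # match the extension actually used for the caption files
enable_bucket = true
min_bucket_reso = 512
max_bucket_reso = 2048

[[datasets]]
resolution = [1500, 1000]    # width, height
batch_size = 1

  [[datasets.subsets]]
  image_dir = "/workspace/finetune/image/1_example"
  num_repeats = 1

The .toml file is then passed to the training script via --dataset_config (or the corresponding field in the GUI) instead of specifying the image folder directly.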

Artemis1111 commented 1 month ago

I attempted fine-tuning rather than LoRA or Dreambooth, and it seems that this may have caused the error. After asking around, I learned that for Flux, being a distilled model, training is currently only feasible via LoRA and Dreambooth.

My aim was not to introduce a new concept but to apply a style change across all prompts, rather than focusing on specific keywords or concepts. Is it correct to say that fine-tuning the model so the style change affects every prompt is currently not possible?

kohya-ss commented 1 month ago

Sorry for the confusion. You can train with arbitrary captions in the Dreambooth format; you are not limited to specific keywords. That means you get the same results whether you manage the captions in a metadata .json or in per-image text files.
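To make that equivalence concrete (the contents and the exact metadata layout below are my illustration, not taken from this thread), the same free-form caption can live either next to the image as a text file, e.g.

photo_001.txt:
an illustration in a soft watercolor style, muted colors, loose brush strokes

or as an entry in the metadata .json produced by the merge_captions_to_metadata.py step, roughly of the form

{
  "/workspace/finetune/image/1_example/photo_001.jpg": {
    "caption": "an illustration in a soft watercolor style, muted colors, loose brush strokes"
  }
}

Either way, the caption is arbitrary descriptive text; no trigger keyword is required.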