Don't understand why it always shows the error 'No data found. Please verify arguments.'

xddun commented 5 months ago

The error is:

                INFO     [Dataset 0]                                                                                                         config_util.py:571
                INFO     loading image sizes.                                                                                                 train_util.py:853

0it [00:00, ?it/s]
                INFO     make buckets                                                                                                         train_util.py:859
                WARNING  min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image train_util.py:876
                         size automatically /
                         bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
                INFO     number of images (including repeats) / 各bucketの画像枚数（繰り返し回数を含む）                                      train_util.py:905
/usr/local/lib/python3.10/dist-packages/numpy/core/fromnumeric.py:3504: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/usr/local/lib/python3.10/dist-packages/numpy/core/_methods.py:129: RuntimeWarning: invalid value encountered in scalar divide
  ret = ret.dtype.type(ret / rcount)
                INFO     mean ar error (without repeats): nan                                                                                 train_util.py:915
                ERROR    No data found. Please verify arguments (train_data_dir must be the parent of folders with images) /               train_network.py:212
                         画像がありません。引数指定を確認してください（train_data_dirには画像があるフォルダではなく、画像があるフォルダの親フォルダを指定する必要があります）

My command is:

accelerate launch --num_cpu_threads_per_process=2 "./sdxl_train_network.py" \
--enable_bucket \
--network_train_unet_only \
--learning_rate="0.0001" \
--unet_lr=0.0001 \
--lr_scheduler="constant" \
--optimizer_type="AdamW8bit" \
--train_batch_size="1" \
--save_every_n_epochs="1" \
--max_train_epochs=100 \
--resolution="1024,1024" \
--network_dim=32 \
--network_alpha="32" \
--pretrained_model_name_or_path="/workspace/stable-diffusion-webui/models/Stable-diffusion/sd_xl_base_1.0.safetensors" \
--train_data_dir="/workspace/yifei/"  \
--output_dir="/workspace/yifei_model/" \
--logging_dir="/workspace/train-log/" \
--output_name="mogu" \
--save_model_as=safetensors \
--network_module=networks.lora \
--lr_scheduler_num_cycles="3" \
--cache_text_encoder_outputs \
--caption_extension=".txt" \
--min_bucket_reso=256 \
--max_bucket_reso=1024 \
--no_half_vae \
--full_bf16 \
--mixed_precision="bf16" \
--save_precision="bf16" \
--cache_latents \
--cache_latents_to_disk \
--max_data_loader_n_workers="0" \
--bucket_reso_steps=64 \
--xformers \
--clip_skip=2 \
--noise_offset=0.0357 \
--gradient_checkpointing \
--bucket_no_upscale \
--sample_every_n_epochs="1"   \
--sample_sampler=euler_a   \
--sample_prompts="/workspace/kohya_ss/sd-scripts/prompt.txt"

Contents of /workspace/yifei/:

There are 23 images (each with width and height greater than 1024), and their corresponding .txt caption files, generated by blip2, are in the same folder.

I don't understand what's wrong. Here is the full log:


(venv) root@d535cd351e69:/workspace/kohya_ss/sd-scripts# accelerate launch --num_cpu_threads_per_process=2 "./sdxl_train_network.py" \
--enable_bucket \
--network_train_unet_only \
--learning_rate="0.0001" \
--unet_lr=0.0001 \
--lr_scheduler="constant" \
--optimizer_type="AdamW8bit" \
--train_batch_size="1" \
--save_every_n_epochs="1" \
--max_train_epochs=100 \
--resolution="1024,1024" \
--network_dim=32 \
--network_alpha="32" \
--pretrained_model_name_or_path="/workspace/stable-diffusion-webui/models/Stable-diffusion/sd_xl_base_1.0.safetensors" \
--train_data_dir="/workspace/yifei/"  \
--output_dir="/workspace/yifei_model/" \
--logging_dir="/workspace/train-log/" \
--output_name="mogu" \
--save_model_as=safetensors \
--network_module=networks.lora \
--lr_scheduler_num_cycles="3" \
--cache_text_encoder_outputs \
--caption_extension=".txt" \
--min_bucket_reso=256 \
--max_bucket_reso=1024 \
--no_half_vae \
--full_bf16 \
--mixed_precision="bf16" \
--save_precision="bf16" \
--cache_latents \
--cache_latents_to_disk \
--max_data_loader_n_workers="0" \
--bucket_reso_steps=64 \
--xformers \
--clip_skip=2 \
--noise_offset=0.0357 \
--gradient_checkpointing \
--bucket_no_upscale \
--sample_every_n_epochs="1"   \
--sample_sampler=euler_a   \
--sample_prompts="/workspace/kohya_ss/sd-scripts/prompt.txt"
2024-04-25 09:15:47.910941: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-04-25 09:15:47.943903: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-25 09:15:47.943929: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-25 09:15:47.944654: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-04-25 09:15:47.949114: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-25 09:15:48.631106: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-25 09:15:49 INFO     prepare tokenizers                                                                                              sdxl_train_util.py:134
2024-04-25 09:15:55 INFO     Using DreamBooth method.                                                                                          train_network.py:172
                    INFO     prepare images.                                                                                                     train_util.py:1572
                    INFO     0 train images with repeating.                                                                                      train_util.py:1613
                    INFO     0 reg images.                                                                                                       train_util.py:1616
                    WARNING  no regularization images / 正則化画像が見つかりませんでした                                                         train_util.py:1621
                    INFO     [Dataset 0]                                                                                                         config_util.py:565
                               batch_size: 1
                               resolution: (1024, 1024)
                               enable_bucket: False
                               network_multiplier: 1.0

                    INFO     [Dataset 0]                                                                                                         config_util.py:571
                    INFO     loading image sizes.                                                                                                 train_util.py:853
0it [00:00, ?it/s]
                    INFO     make buckets                                                                                                         train_util.py:859
                    WARNING  min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image train_util.py:876
                             size automatically /
                             bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_r
                             esoは無視されます
                    INFO     number of images (including repeats) / 各bucketの画像枚数（繰り返し回数を含む）                                      train_util.py:905
/usr/local/lib/python3.10/dist-packages/numpy/core/fromnumeric.py:3504: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/usr/local/lib/python3.10/dist-packages/numpy/core/_methods.py:129: RuntimeWarning: invalid value encountered in scalar divide
  ret = ret.dtype.type(ret / rcount)
                    INFO     mean ar error (without repeats): nan                                                                                 train_util.py:915
                    ERROR    No data found. Please verify arguments (train_data_dir must be the parent of folders with images) /               train_network.py:212
                             画像がありません。引数指定を確認してください（train_data_dirには画像があるフォルダではなく、画像があるフォルダの
                             親フォルダを指定する必要があります）

xddun commented 5 months ago

--sample_prompts="/workspace/kohya_ss/sd-scripts/prompt.txt":

DKnight54 commented 5 months ago

--train_data_dir="/workspace/yifei/"

When you pass a training data directory on the command line, sd-scripts expects a specific folder structure: the images and captions have to be in a subfolder whose name follows a specific format.

You will need to move your images and captions into a subfolder named in this format: "numofrepeats_subject class". Assuming you want 5 repeats, that would probably be "5_yifei woman".

So the folder structure should be /workspace/yifei/5_yifei woman, with the argument --train_data_dir="/workspace/yifei/" left unchanged. This is also why your log reports "0 train images with repeating": the images currently sit directly in train_data_dir, which has no such subfolder.
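
For illustration, here is a minimal sketch of that move as a small shell function. The function name restructure_dataset is made up for this example, and the repeat count 5 and the name "yifei woman" are only the values suggested above. It assumes every regular file sitting directly in the data directory is an image or a caption.

```shell
#!/usr/bin/env bash
# Sketch only: move loose images and captions into a "<repeats>_<name>" subfolder.
# restructure_dataset is a hypothetical helper, not part of sd-scripts.
restructure_dataset() {
  local data_dir="${1%/}"   # parent folder passed to --train_data_dir (trailing slash stripped)
  local repeats="$2"        # number of repeats per image
  local name="$3"           # subject and class, e.g. "yifei woman"
  local target="$data_dir/${repeats}_${name}"

  mkdir -p "$target"
  # Move only regular files located directly in data_dir; subfolders are left alone.
  find "$data_dir" -maxdepth 1 -type f -exec mv -- {} "$target"/ \;
}

# Usage with the paths from this thread:
# restructure_dataset /workspace/yifei 5 "yifei woman"
```

After running it, --train_data_dir still points at /workspace/yifei/, and the images and .txt files sit in /workspace/yifei/5_yifei woman.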

xddun commented 5 months ago

Nice answer, thank you!

kohya-ss / sd-scripts

Don't understand why it always shows an error saying 'No data found. Please verify arguments. #1294