SoumyaMB10 commented 3 months ago

Describe the bug

env: MODEL_NAME=runwayml/stable-diffusion-v1-5
env: INSTANCE_DIR=/content/drive/MyDrive/Newfolder
env: HF_ENDPOINT=https://hf-mirror.com/
2024-08-18 08:46:08.308678: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-08-18 08:46:08.328601: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-08-18 08:46:08.334721: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-08-18 08:46:09.559880: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
08/18/2024 08:46:10 - INFO - main - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: no

{'dynamic_thresholding_ratio', 'variance_type', 'clip_sample_range', 'sample_max_value', 'thresholding', 'timestep_spacing', 'prediction_type', 'rescale_betas_zero_snr'} was not found in config. Values will be initialized to default values.
{'scaling_factor', 'latents_mean', 'use_quant_conv', 'latents_std', 'shift_factor', 'force_upcast', 'mid_block_add_attention', 'use_post_quant_conv'} was not found in config. Values will be initialized to default values.
{'transformer_layers_per_block', 'mid_block_type', 'addition_time_embed_dim', 'encoder_hid_dim_type', 'time_cond_proj_dim', 'dual_cross_attention', 'projection_class_embeddings_input_dim', 'num_attention_heads', 'reverse_transformer_layers_per_block', 'time_embedding_act_fn', 'mid_block_only_cross_attention', 'addition_embed_type', 'use_linear_projection', 'num_class_embeds', 'encoder_hid_dim', 'only_cross_attention', 'resnet_time_scale_shift', 'time_embedding_dim', 'cross_attention_norm', 'time_embedding_type', 'addition_embed_type_num_heads', 'conv_out_kernel', 'attention_type', 'dropout', 'class_embeddings_concat', 'timestep_post_act', 'class_embed_type', 'conv_in_kernel', 'upcast_attention', 'resnet_skip_time_act', 'resnet_out_scale_factor'} was not found in config. Values will be initialized to default values.
Resolving data files: 100% 18/18 [00:00<00:00, 150094.38it/s]
Generating train split: 9 examples [00:00, 377.47 examples/s]
Traceback (most recent call last):
  File "/content/train_text_to_image_lora.py", line 979, in <module>
    main()
  File "/content/train_text_to_image_lora.py", line 625, in main
    raise ValueError(
ValueError: --image_column' value 'image' needs to be one of: text
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1097, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 703, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_text_to_image_lora.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--dataset_name=/content/drive/MyDrive/Newfolder', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--checkpointing_steps=100', '--learning_rate=1e-4', '--report_to=wandb', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=500', '--validation_prompt=forward trajectory', '--validation_epochs=50', '--seed=0', '--push_to_hub']' returned non-zero exit status 1.

I also had a dependency issue and think this error may be related to it.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf-cu12 24.4.1 requires pyarrow<15.0.0a0,>=14.0.1, but you have pyarrow 17.0.0 which is incompatible.
ibis-framework 8.0.0 requires pyarrow<16,>=2, but you have pyarrow 17.0.0 which is incompatible.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datasets 2.21.0 requires pyarrow>=15.0.0, but you have pyarrow 14.0.1 which is incompatible.

pyarrow has conflicting version requirements from cudf-cu12 24.4.1, ibis-framework 8.0.0, and datasets 2.21.0.
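To make the conflict concrete, here is a rough sketch that checks a few candidate pyarrow versions against the three constraints quoted in the pip messages. It is only an illustration: the helper name `satisfied_by` is made up, versions are compared as plain release tuples, and the pre-release bound `<15.0.0a0` is simplified to `<15.0.0`.

```python
# Constraints copied from the pip messages, simplified to
# (inclusive lower bound, exclusive upper bound) on the release tuple.
# Assumption: "<15.0.0a0" is treated as "<15.0.0" (pre-releases ignored).
CONSTRAINTS = {
    "cudf-cu12 24.4.1": ((14, 0, 1), (15, 0, 0)),
    "ibis-framework 8.0.0": ((2,), (16,)),
    "datasets 2.21.0": ((15, 0, 0), None),  # no upper bound stated
}


def satisfied_by(version):
    """Return the packages whose pyarrow constraint accepts `version`."""
    accepted = []
    for package, (low, high) in CONSTRAINTS.items():
        if version >= low and (high is None or version < high):
            accepted.append(package)
    return accepted


# 14.0.1 and 17.0.0 are the versions named in the pip output;
# 15.0.0 is the lowest version datasets 2.21.0 accepts.
for candidate in [(14, 0, 1), (15, 0, 0), (17, 0, 0)]:
    print(candidate, "->", satisfied_by(candidate))
```

Under these simplifications no single version satisfies all three packages, because cudf-cu12 wants pyarrow below 15 while datasets wants 15 or newer. Whether this conflict has anything to do with the `--image_column` error is a separate question; the error message itself only reports which dataset columns were found.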

Reproduction

!pip install git+https://github.com/huggingface/diffusers

!pip install accelerate

!pip install -r https://raw.githubusercontent.com/huggingface/diffusers/main/examples/text_to_image/requirements.txt

!pip install pyarrow==14.0.1

!accelerate config default

%env MODEL_NAME=runwayml/stable-diffusion-v1-5
%env INSTANCE_DIR=/content/drive/MyDrive/Newfolder
%env HF_ENDPOINT=https://hf-mirror.com

!accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$INSTANCE_DIR \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="forward trajectory" \
  --validation_epochs=50 \
  --seed="0" \
  --push_to_hub

Logs

No response

System Info

🤗 Diffusers version: 0.31.0.dev0
Platform: Linux-6.1.85+-x86_64-with-glibc2.35
Running on Google Colab?: Yes
Python version: 3.10.12
PyTorch version (GPU?): 2.3.1+cu121 (True)
Flax version (CPU?/GPU?/TPU?): 0.8.4 (gpu)
Jax version: 0.4.26
JaxLib version: 0.4.26
Huggingface_hub version: 0.23.5
Transformers version: 4.42.4
Accelerate version: 0.32.1
PEFT version: 0.7.0
Bitsandbytes version: not installed
Safetensors version: 0.4.4
xFormers version: not installed
Accelerator: Tesla T4, 15360 MiB
Using GPU in script?:
Using distributed or parallel set-up in script?:

Who can help?

@sayakpaul

To add: my dataset folder contains image.png and image.txt files.

Screenshot 2024-08-18 095610

sayakpaul commented 3 months ago

Can you provide a more minimal reproducible snippet?

SoumyaMB10 commented 3 months ago

%env MODEL_NAME=runwayml/stable-diffusion-v1-5
%env INSTANCE_DIR=/content/drive/MyDrive/Newfolder
%env HF_ENDPOINT=https://hf-mirror.com/

!accelerate launch train_text_to_image_lora.py --pretrained_model_name_or_path=$MODEL_NAME --dataset_name=$INSTANCE_DIR --resolution=512 --train_batch_size=1 --gradient_accumulation_steps=1 --checkpointing_steps=100 --learning_rate=1e-4 --report_to="wandb" --lr_scheduler="constant" --lr_warmup_steps=0 --max_train_steps=500 --validation_prompt="forward trajectory" --validation_epochs=50 --seed="0" --push_to_hub

SoumyaMB10 commented 3 months ago

The error occurred while running the snippet above:
ValueError: --image_column' value 'image' needs to be one of: text
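For readers trying to interpret this message, the following is a simplified sketch of the kind of check that produces it. It is a paraphrase, not the actual code at line 625 of the training script, and the function name `resolve_image_column` is hypothetical. The point it illustrates is that "needs to be one of: text" lists the columns the loaded dataset actually has, so here the dataset was loaded with a single `text` column and no `image` column.

```python
def resolve_image_column(column_names, image_column="image"):
    """Return image_column if the dataset has it, else raise like the script does.

    Simplified paraphrase of the validation step; not the script's real code.
    """
    if image_column not in column_names:
        raise ValueError(
            f"--image_column' value '{image_column}' needs to be one of: "
            f"{', '.join(column_names)}"
        )
    return image_column


# A dataset that was loaded with only a "text" column reproduces the message.
try:
    resolve_image_column(["text"])
except ValueError as err:
    print(err)

# A dataset with both columns passes the check.
print(resolve_image_column(["image", "text"]))
```

Read this way, the useful debugging question is why the folder was loaded as text-only data rather than as images with captions.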

sayakpaul commented 3 months ago

The above is a script, not a snippet.

SoumyaMB10 commented 3 months ago

%env MODEL_NAME=runwayml/stable-diffusion-v1-5
%env INSTANCE_DIR=/content/drive/MyDrive/Newfolder
%env HF_ENDPOINT=https://hf-mirror.com

!accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$INSTANCE_DIR \
  --push_to_hub

Error:
Traceback (most recent call last):
  File "/content/train_text_to_image_lora.py", line 979, in <module>
    main()
  File "/content/train_text_to_image_lora.py", line 625, in main
    raise ValueError(
ValueError: --image_column' value 'image' needs to be one of: text

SoumyaMB10 commented 3 months ago

@sayakpaul I have added the minimal snippet required to reproduce the error. Please advise.

sayakpaul commented 3 months ago

That is not a minimal code snippet; that is a training command. By a minimal code snippet I meant something like the following:

from datasets import load_dataset 

dataset = load_dataset(my_directory, metadata="...")

SoumyaMB10 commented 3 months ago

I did not use load_dataset; I passed the dataset directly in the training command.
If I run the above minimal code snippet, here is the output:

SoumyaMB10 commented 3 months ago

Any comments on this, @sayakpaul?

mj-x commented 2 months ago

Maybe your dataset has columns named "image" and "label", and it does not read the metadata you provided. I see this from: https://github.com/huggingface/diffusers/issues/6445
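One way to give the loader explicit metadata is to convert the folder of same-named image and caption files into an image folder with a `metadata.jsonl` file, where each line has a `file_name` key and a `text` key. The sketch below assumes every image `name.png` has its caption in `name.txt`; the function name `build_imagefolder` and the output path in the usage note are hypothetical, and the script is untested against the reporter's actual data.

```python
import json
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def build_imagefolder(src_dir, dst_dir):
    """Copy captioned images from src_dir to dst_dir and write metadata.jsonl.

    Each image `name.png` is expected to have a caption in `name.txt`.
    Images without a caption file are skipped.
    Returns the number of (image, caption) pairs written.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    with open(dst / "metadata.jsonl", "w", encoding="utf-8") as meta:
        for image in sorted(src.iterdir()):
            if image.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            caption_file = image.with_suffix(".txt")
            if not caption_file.exists():
                continue
            caption = caption_file.read_text(encoding="utf-8").strip()
            # Copy only the image, so the new folder holds no loose .txt files.
            shutil.copy2(image, dst / image.name)
            record = {"file_name": image.name, "text": caption}
            meta.write(json.dumps(record) + "\n")
            count += 1
    return count
```

For example, `build_imagefolder("/content/drive/MyDrive/Newfolder", "/content/dataset")` would write the converted copy to `/content/dataset` (an arbitrary path chosen for illustration). Pointing the training script at that folder, for instance through its `--train_data_dir` option, should then expose `image` and `text` columns, though this has not been verified on this dataset. Writing to a separate folder is a deliberate choice here: it keeps the original files untouched and avoids mixing `.txt` files with the images.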

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

mj-x commented 2 months ago

Your message has been received. Thank you very much for writing; I will reply as soon as possible. This is an automatic reply confirming that your e-mail was received. Thank you.

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

huggingface / diffusers

ValueError: --image_column' value 'image' needs to be one of: text #9210