bmaltais / kohya_ss

Apache License 2.0

TypeError: ClusterConfig.__init__() got an unexpected keyword argument 'debug' #1554

Closed Mostraet closed 8 months ago

Mostraet commented 1 year ago

I'm trying to train a LoRA on Linux Mint with an NVIDIA GPU and 16 GB of RAM, and I get the following error about debugging. I've run accelerate config, but that didn't help. Please help.

Full error log below.

22:56:32-223329 INFO     Saving training config to                              
                         /home/username/kohya_ss/output/model/trainingset_v01_20
                         230928-225632.json...                                  
22:56:32-224553 INFO     accelerate launch --num_cpu_threads_per_process=2      
                         "./sdxl_train_network.py" --enable_bucket              
                         --min_bucket_reso=256 --max_bucket_reso=1024           
                         --pretrained_model_name_or_path="/home/username/stable-dif
                         fusion-webui/models/Stable-diffusion/nightvisionXL_v074
                         3.safetensors"                                         
                         --train_data_dir="/home/username/kohya_ss/output/img"     
                         --resolution="1024,1024"                               
                         --output_dir="/home/username/kohya_ss/output/model"       
                         --logging_dir="/home/username/kohya_ss/output/log"        
                         --network_alpha="64" --training_comment="3 repeats.    
                         More info: https://civitai.com/articles/1771"          
                         --save_model_as=safetensors                            
                         --network_module=networks.lora --text_encoder_lr=5e-05 
                         --unet_lr=0.0001 --network_dim=256                     
                         --output_name="trainingset_v01"                     
                         --lr_scheduler_num_cycles="5" --no_half_vae            
                         --learning_rate="0.0001"                               
                         --lr_scheduler="cosine_with_restarts"                  
                         --train_batch_size="1" --max_train_steps="5700"        
                         --save_every_n_epochs="1" --mixed_precision="fp16"     
                         --save_precision="fp16" --caption_extension=".txt"     
                         --cache_latents --cache_latents_to_disk                
                         --optimizer_type="AdamW" --max_train_epochs=5          
                         --max_data_loader_n_workers="0"                        
                         --caption_dropout_rate="0.05" --bucket_reso_steps=64   
                         --min_snr_gamma=5 --gradient_checkpointing --xformers  
                         --bucket_no_upscale --noise_offset=0.0                 
                         --sample_sampler=k_dpm_2                               
                         --sample_prompts="/home/username/kohya_ss/output/model/sam
                         ple/prompt.txt" --sample_every_n_epochs="1"            
2023-09-28 22:56:33.478902: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-28 22:56:33.964685: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /home/username/kohya_ss/venv/bin/accelerate:8 in <module>                       │
│                                                                              │
│   5 from accelerate.commands.accelerate_cli import main                      │
│   6 if __name__ == '__main__':                                               │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])     │
│ ❱ 8 │   sys.exit(main())                                                     │
│   9                                                                          │
│                                                                              │
│ /home/username/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/a │
│ ccelerate_cli.py:45 in main                                                  │
│                                                                              │
│   42 │   │   exit(1)                                                         │
│   43 │                                                                       │
│   44 │   # Run                                                               │
│ ❱ 45 │   args.func(args)                                                     │
│   46                                                                         │
│   47                                                                         │
│   48 if __name__ == "__main__":                                              │
│                                                                              │
│ /home/username/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/l │
│ aunch.py:895 in launch_command                                               │
│                                                                              │
│   892                                                                        │
│   893                                                                        │
│   894 def launch_command(args):                                              │
│ ❱ 895 │   args, defaults, mp_from_config_flag = _validate_launch_command(arg │
│   896 │                                                                      │
│   897 │   # Use the proper launcher                                          │
│   898 │   if args.use_deepspeed and not args.cpu:                            │
│                                                                              │
│ /home/username/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/l │
│ aunch.py:782 in _validate_launch_command                                     │
│                                                                              │
│   779 │   mp_from_config_flag = False                                        │
│   780 │   # Get the default from the config file.                            │
│   781 │   if args.config_file is not None or os.path.isfile(default_config_f │
│ ❱ 782 │   │   defaults = load_config_from_file(args.config_file)             │
│   783 │   │   if (                                                           │
│   784 │   │   │   not args.multi_gpu                                         │
│   785 │   │   │   and not args.tpu                                           │
│                                                                              │
│ /home/username/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/c │
│ onfig/config_args.py:72 in load_config_from_file                             │
│                                                                              │
│    69 │   │   │   │   config_class = ClusterConfig                           │
│    70 │   │   │   else:                                                      │
│    71 │   │   │   │   config_class = SageMakerConfig                         │
│ ❱  72 │   │   │   return config_class.from_yaml_file(yaml_file=config_file)  │
│    73                                                                        │
│    74                                                                        │
│    75 @dataclass                                                             │
│                                                                              │
│ /home/username/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/c │
│ onfig/config_args.py:135 in from_yaml_file                                   │
│                                                                              │
│   132 │   │   │   config_dict["dynamo_config"] = {} if dynamo_backend == "NO │
│   133 │   │   if "use_cpu" not in config_dict:                               │
│   134 │   │   │   config_dict["use_cpu"] = False                             │
│ ❱ 135 │   │   return cls(**config_dict)                                      │
│   136 │                                                                      │
│   137 │   def to_yaml_file(self, yaml_file):                                 │
│   138 │   │   with open(yaml_file, "w", encoding="utf-8") as f:              │
╰──────────────────────────────────────────────────────────────────────────────╯
TypeError: ClusterConfig.__init__() got an unexpected keyword argument 'debug'
^CKeyboard interruption in main thread... closing server.
Liyu96sc commented 1 year ago

I have also encountered the same problem. Have you resolved it?

Mostraet commented 1 year ago

> I have also encountered the same problem. Have you resolved it?

No, tried again today and couldn't get it to work, same error.

TeKett commented 1 year ago

Got the same issue on 21.8.8; it just randomly stopped working and started throwing this error. I downloaded a fresh setup of kohya with version 22.0.1, which doesn't throw this error. I would still like to get 21.8.8 working to double-check whether it's supposed to use 4 times as much VRAM when training XL as when training 1.5.

RayHell commented 1 year ago

It's the latest Windows update. I have 2 systems with the exact same config. They have been used for SDXL training for months under the 21.8.5 commit. One of my systems got the new Windows update today and I got this error right after; the other one didn't receive the update and is still working fine.

tuwonga commented 1 year ago

I have the same issue.

TeKett commented 1 year ago

A new setup of kohya doesn't show the error, but my existing setup still had issues, so I redownloaded Python and the error went away. No clue what could have gone wrong.

1099271 commented 11 months ago

I solved it!

The problem comes from the default_config.yaml that accelerate loads.

In my case, the default_config.yaml referenced here is at C:{Users}\AppData\Local\huggingface\accelerate\default_config.yaml (check where the file is on your own system).

Open this file and comment out the debug parameter; that solves the problem.

@Mostraet @tuwonga @Liyu96sc
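
For anyone who prefers to script the edit described above, here is a minimal sketch rather than an official fix. The helper names `comment_out_key` and `patch_config_file` are hypothetical, the sample YAML is illustrative only, and the path to default_config.yaml is something you supply yourself (on Linux it is commonly under ~/.cache/huggingface/accelerate/, but check your own system). The file is backed up before it is rewritten.

```python
import re
from pathlib import Path


def comment_out_key(yaml_text: str, key: str) -> str:
    """Return yaml_text with every top-level `key: ...` line commented out."""
    pattern = re.compile(rf"^{re.escape(key)}\s*:")
    out_lines = []
    for line in yaml_text.splitlines():
        if pattern.match(line):
            out_lines.append("# " + line)  # disable the key, keep it visible
        else:
            out_lines.append(line)
    return "\n".join(out_lines) + "\n"


def patch_config_file(path: Path, key: str = "debug") -> None:
    """Hypothetical helper: back up the file, then comment out `key` in place."""
    original = path.read_text(encoding="utf-8")
    backup = path.with_name(path.name + ".bak")
    backup.write_text(original, encoding="utf-8")
    path.write_text(comment_out_key(original, key), encoding="utf-8")


# Illustrative config text, not a copy of anyone's real file.
sample = "compute_environment: LOCAL_MACHINE\ndebug: false\nmixed_precision: fp16\n"
patched = comment_out_key(sample, "debug")
print(patched)
```

To apply it to a real file, call `patch_config_file(Path(...))` with the location you found. Already commented lines are left alone, so running it twice is harmless.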

tuwonga commented 11 months ago

Thanks a lot! I got a fresh setup and that solved it, but this fix is awesome!

dantruonghtno1 commented 9 months ago

I tried pip install --upgrade accelerate and it worked.

Fanshaoliu commented 6 months ago

> I solved it!
>
> The problem comes from the default_config.yaml that accelerate loads.
>
> In my case, the default_config.yaml referenced here is at C:{Users}\AppData\Local\huggingface\accelerate\default_config.yaml (check where the file is on your own system).
>
> Open this file and comment out the debug parameter; that solves the problem.
>
> @Mostraet @tuwonga @Liyu96sc

Thanks for your method! Can you share some insight into how you found it? It is not straightforward.
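
One way to arrive at it is to read the last frame of the traceback in the first post: from_yaml_file ends with `return cls(**config_dict)`, so every key in the YAML file is passed to the config class as a keyword argument. The sketch below reproduces that failure mode under stated assumptions. `ToyClusterConfig` is a hypothetical stand-in with only three fields, not accelerate's real ClusterConfig, and the dictionary is illustrative.

```python
from dataclasses import dataclass, fields


@dataclass
class ToyClusterConfig:
    # Hypothetical stand-in; the real ClusterConfig has many more fields.
    compute_environment: str
    mixed_precision: str = "no"
    use_cpu: bool = False


# Illustrative contents of a config file that includes a `debug` key.
config_dict = {
    "compute_environment": "LOCAL_MACHINE",
    "mixed_precision": "fp16",
    "debug": False,
}

try:
    # Same pattern as the traceback: every YAML key becomes a keyword argument.
    ToyClusterConfig(**config_dict)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Keys in the file that the class does not declare are the ones to look at.
known = {f.name for f in fields(ToyClusterConfig)}
unknown_keys = sorted(set(config_dict) - known)
print(unknown_keys)
```

A key the class does not declare raises the same kind of TypeError, which is why both commenting out debug and upgrading accelerate were reported to work in this thread.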