kohya-ss / sd-scripts

An error with SDXL finetuning: AttributeError: module 'library.train_util' has no attribute 'LossRecorder' #932

Closed · a-l-e-x-d-s-9 closed 1 year ago

a-l-e-x-d-s-9 commented 1 year ago

I updated to the latest version and tried to run SDXL finetuning, but I get this error:

  File "/workspace/kohya_ss/sdxl_train.py", line 775, in <module>
    train(args)
  File "/workspace/kohya_ss/sdxl_train.py", line 464, in train
    loss_recorder = train_util.LossRecorder()
AttributeError: module 'library.train_util' has no attribute 'LossRecorder'
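
For context, recent versions of sdxl_train.py create a train_util.LossRecorder to track per-step losses and log a moving average, so this attribute lookup fails if the local library/train_util.py predates that class. A minimal sketch of what such a class might look like (an assumption based on the traceback above, not the repo's exact code):

    class LossRecorder:
        # Sketch only; the actual implementation in library/train_util.py may differ.
        def __init__(self):
            self.loss_list = []   # one recorded loss per step of an epoch
            self.loss_total = 0.0

        def add(self, *, epoch, step, loss):
            if epoch == 0:
                # First epoch: grow the list, one entry per step.
                self.loss_list.append(loss)
            else:
                # Later epochs: replace the loss recorded for this step.
                self.loss_total -= self.loss_list[step]
                self.loss_list[step] = loss
            self.loss_total += loss

        @property
        def moving_average(self):
            return self.loss_total / len(self.loss_list)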

The command I used to run the training:

python  /workspace/kohya_ss/sdxl_train.py --pretrained_model_name_or_path=/workspace/model/sd_xl_base_1.0_0.9vae.safetensors --optimizer_type=Prodigy --learning_rate=1 --max_grad_norm=1.0 --optimizer_args='decouple=True weight_decay=0.01 betas=0.9,0.99 use_bias_correction=True' --lr_scheduler=constant --output_dir=/workspace/output/ --output_name=babes_1.0_xl_tr_03a --save_precision=bf16 --save_every_n_epochs=2 --save_every_n_steps=0 --save_last_n_steps=0 --save_last_n_steps_state=0 --resume='' --train_batch_size=8 --max_token_length=225 --xformers --max_train_epochs=20 --max_data_loader_n_workers=64 --seed=916130031 --gradient_checkpointing --gradient_accumulation_steps=5 --mixed_precision=bf16 --clip_skip=1 --logging_dir='' --wandb_api_key=cdbb2f1d7d330e732464b076e0304c78b809c966 --noise_offset=0.15 --multires_noise_iterations=80 --multires_noise_discount=0 --adaptive_noise_scale=0.075 --min_timestep=0 --max_timestep=100000000 --sample_every_n_steps=0 --sample_every_n_epochs=1 --sample_prompts=/workspace/output/sample/prompt.txt --sample_sampler=euler_a --train_data_dir=/workspace/input/dataset --shuffle_caption --caption_extension=.txt --keep_tokens=0 --flip_aug --resolution=1024 --cache_latents --vae_batch_size=0 --cache_latents_to_disk --min_bucket_reso=128 --max_bucket_reso=2048 --bucket_reso_steps=64 --bucket_no_upscale --caption_dropout_rate=0 --caption_dropout_every_n_epochs=0 --dataset_repeats=1 --save_model_as=safetensors --dataset_config=/workspace/dataset_settings.toml

Log: LossRecorder_error.txt

I made a small change to train_util.py to make optimizer_args work:

    # Requires "import ast" and "import re" at the top of train_util.py.
    optimizer_kwargs = {}
    if args.optimizer_args is not None and len(args.optimizer_args) > 0:
        print(f"args.optimizer_args: {args.optimizer_args}")
        optimizer_args = args.optimizer_args

        # Accept a single space-separated string as well as a list of strings.
        if isinstance(optimizer_args, str):
            optimizer_args = [optimizer_args]

        # Matches key=value pairs; values may contain dots and commas
        # (e.g. betas=0.9,0.99).
        pattern = re.compile(r'(\w+)=([\w\.\,]+)')

        for current_arg_string in optimizer_args:
            args_opt = pattern.findall(current_arg_string)

            for key, value in args_opt:
                print(f"key: {key}, value: {value}")
                try:
                    # First, attempt to evaluate the value as a Python literal
                    # (e.g. "True" becomes True, "0.01" becomes 0.01).
                    converted_value = ast.literal_eval(value)
                except (ValueError, SyntaxError):
                    # If literal_eval fails, it might be a comma-separated
                    # list of numbers, e.g. "0.9,0.99".
                    if ',' in value:
                        try:
                            converted_value = [ast.literal_eval(v.strip()) for v in value.split(',')]
                        except (ValueError, SyntaxError):
                            converted_value = value
                    else:
                        converted_value = value

                optimizer_kwargs[key] = converted_value
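
As a side note, ast.literal_eval already parses a string like "0.9,0.99" as a tuple, so the comma-splitting fallback above only fires for values that are not valid Python literals. A small standalone sketch (the function name parse_optimizer_args is mine, for illustration) shows what this parsing yields for the --optimizer_args value from the command above:

    import ast
    import re

    def parse_optimizer_args(arg_strings):
        # Standalone version of the parsing above, for illustration only.
        if isinstance(arg_strings, str):
            arg_strings = [arg_strings]
        pattern = re.compile(r'(\w+)=([\w\.\,]+)')
        kwargs = {}
        for s in arg_strings:
            for key, value in pattern.findall(s):
                try:
                    kwargs[key] = ast.literal_eval(value)
                except (ValueError, SyntaxError):
                    kwargs[key] = value  # keep non-literal values as plain strings
        return kwargs

    print(parse_optimizer_args("decouple=True weight_decay=0.01 betas=0.9,0.99 use_bias_correction=True"))
    # {'decouple': True, 'weight_decay': 0.01, 'betas': (0.9, 0.99), 'use_bias_correction': True}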

Not sure what the problem is; any suggestions?

a-l-e-x-d-s-9 commented 1 year ago

It turns out I hadn't updated the script that I changed: my locally modified train_util.py wasn't replaced when I pulled the latest version, so it was still missing LossRecorder.