ShivamShrirao / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
https://huggingface.co/docs/diffusers
Apache License 2.0

Adding support for Stable Diffusion V2.1 #162

Open bach777 opened 1 year ago

bach777 commented 1 year ago

Please add support for Stable Diffusion V2.1.

rmac85 commented 1 year ago

It should work fine, but on Colab use stabilityai/stable-diffusion-2-1-base, not just 2-1. The problem lies with the image resolution: you can't finetune a model at 768 resolution on Colab, as it is guaranteed to run out of memory, and if you train the 768 model at 512 resolution it gets mixed up. Your best bet is to use 2-1-base and also change the scheduler to one of the Eulers.
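
For reference, a minimal inference sketch of that suggestion (not from the thread; the prompt and output filename are placeholders): load the 512px 2-1-base checkpoint and swap in an Euler scheduler via the diffusers API.

```python
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

# Load the 512px base model rather than the 768px stabilityai/stable-diffusion-2-1.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    torch_dtype=torch.float16,
)

# Swap the default scheduler for an Euler scheduler, as suggested above.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

# Placeholder prompt/filename, just to show 512x512 inference end to end.
image = pipe("a photo of an astronaut riding a horse", height=512, width=512).images[0]
image.save("sample.png")
```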

NeoAnthropocene commented 1 year ago

It pops up a CUDA error with the current setup when we try the SD 2.1-base version, even though this is the 512-resolution model and should work.

Traceback (most recent call last):
  File "/home/astroboy/github/shivamShrirao/diffusers/examples/dreambooth/train_dreambooth.py", line 822, in <module>
    main(args)
  File "/home/astroboy/github/shivamShrirao/diffusers/examples/dreambooth/train_dreambooth.py", line 794, in main
    optimizer.step()
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/accelerate/optimizer.py", line 134, in step
    self.scaler.step(self.optimizer, closure)
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/torch/cuda/amp/grad_scaler.py", line 341, in step
    retval = self._maybe_opt_step(optimizer, optimizer_state, *args, **kwargs)
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/torch/cuda/amp/grad_scaler.py", line 288, in _maybe_opt_step
    retval = optimizer.step(*args, **kwargs)
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/torch/optim/lr_scheduler.py", line 68, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/torch/optim/optimizer.py", line 140, in wrapper
    out = func(*args, **kwargs)
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/bitsandbytes/optim/optimizer.py", line 263, in step
    self.init_state(group, p, gindex, pindex)
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/bitsandbytes/optim/optimizer.py", line 401, in init_state
    state["state2"] = torch.zeros_like(
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Steps:   0%|                                                                                   | 0/3600 [00:04<?, ?it/s]
Traceback (most recent call last):
  File "/home/astroboy/anaconda3/envs/diffusers/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/home/astroboy/anaconda3/envs/diffusers/lib/python3.9/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/astroboy/anaconda3/envs/diffusers/bin/python', 'train_dreambooth.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-2-1-base', '--output_dir=data/model/model_ozguraltay-SD20', '--revision=fp16', '--train_text_encoder', '--with_prior_preservation', '--prior_loss_weight=1.0', '--seed=2384271801', '--resolution=512', '--train_batch_size=1', '--mixed_precision=fp16', '--use_8bit_adam', '--gradient_accumulation_steps=1', '--gradient_checkpointing', '--learning_rate=1e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=200', '--sample_batch_size=4', '--max_train_steps=3600', '--save_interval=1800', '--save_sample_prompt=a full body portrait photo of ozguraltay man in black tuxedo, professional studio photograph, 80mm, f1.8, clean focused on the face', '--concepts_list=concepts_list/concepts_list-ozguraltay_man.json']' returned non-zero exit status 1.

Is anyone else experiencing the same problem?

It works perfectly with the old SD 1.5 version. Will there be any updates to the requirements files for the SD 2.1-base version soon?
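
The crash comes from bitsandbytes' 8-bit Adam while it initializes its optimizer state, so a quick way to isolate it is to relaunch without --use_8bit_adam (and, as the message suggests, with CUDA_LAUNCH_BLOCKING=1 for a synchronous stack trace). Below is a rough sketch of the optimizer choice that flag controls; the variable names and the dummy parameter are illustrative, not the script's actual code.

```python
import torch

# Sketch only (mirrors, but is not, the exact train_dreambooth.py code): with
# --use_8bit_adam the script uses bitsandbytes' AdamW8bit, otherwise plain
# torch.optim.AdamW. If the 8-bit optimizer crashes in init_state with
# "CUDA error: unknown error", relaunching without --use_8bit_adam exercises
# the standard AdamW path instead (at the cost of more VRAM).
use_8bit_adam = False  # i.e. launch without --use_8bit_adam

if use_8bit_adam:
    import bitsandbytes as bnb
    optimizer_class = bnb.optim.AdamW8bit
else:
    optimizer_class = torch.optim.AdamW

# Stand-in for the UNet/text-encoder parameters the real script optimizes.
params_to_optimize = [torch.nn.Parameter(torch.zeros(4, 4))]

optimizer = optimizer_class(params_to_optimize, lr=1e-6)
```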

bach777 commented 1 year ago

@NeoAnthropocene Same problem!

bach777 commented 1 year ago

The following values were not passed to accelerate launch and had defaults used instead:
    --num_processes was set to a value of 1
    --num_machines was set to a value of 1
    --mixed_precision was set to a value of 'no'
    --num_cpu_threads_per_process was set to 1 to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.

usage: train_dreambooth.py [-h] --pretrained_model_name_or_path PRETRAINED_MODEL_NAME_OR_PATH
    [--pretrained_vae_name_or_path PRETRAINED_VAE_NAME_OR_PATH] [--revision REVISION] [--tokenizer_name TOKENIZER_NAME]
    [--instance_data_dir INSTANCE_DATA_DIR] [--class_data_dir CLASS_DATA_DIR] [--instance_prompt INSTANCE_PROMPT]
    [--class_prompt CLASS_PROMPT] [--save_sample_prompt SAVE_SAMPLE_PROMPT]
    [--save_sample_negative_prompt SAVE_SAMPLE_NEGATIVE_PROMPT] [--n_save_sample N_SAVE_SAMPLE]
    [--save_guidance_scale SAVE_GUIDANCE_SCALE] [--save_infer_steps SAVE_INFER_STEPS] [--pad_tokens]
    [--with_prior_preservation] [--prior_loss_weight PRIOR_LOSS_WEIGHT] [--num_class_images NUM_CLASS_IMAGES]
    [--output_dir OUTPUT_DIR] [--seed SEED] [--resolution RESOLUTION] [--center_crop] [--train_text_encoder]
    [--train_batch_size TRAIN_BATCH_SIZE] [--sample_batch_size SAMPLE_BATCH_SIZE] [--num_train_epochs NUM_TRAIN_EPOCHS]
    [--max_train_steps MAX_TRAIN_STEPS] [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
    [--gradient_checkpointing] [--learning_rate LEARNING_RATE] [--scale_lr] [--lr_scheduler LR_SCHEDULER]
    [--lr_warmup_steps LR_WARMUP_STEPS] [--use_8bit_adam] [--adam_beta1 ADAM_BETA1] [--adam_beta2 ADAM_BETA2]
    [--adam_weight_decay ADAM_WEIGHT_DECAY] [--adam_epsilon ADAM_EPSILON] [--max_grad_norm MAX_GRAD_NORM]
    [--push_to_hub] [--hub_token HUB_TOKEN] [--hub_model_id HUB_MODEL_ID] [--logging_dir LOGGING_DIR]
    [--log_interval LOG_INTERVAL] [--save_interval SAVE_INTERVAL] [--save_min_steps SAVE_MIN_STEPS]
    [--mixed_precision {no,fp16,bf16}] [--not_cache_latents] [--hflip] [--local_rank LOCAL_RANK]
    [--concepts_list CONCEPTS_LIST]
train_dreambooth.py: error: unrecognized arguments: SD.2-1

Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-2-1-base', '--pretrained_vae_name_or_path=stabilityai/sd-vae-ft-mse', '--output_dir=/content/drive/MyDrive/stable_diffusion_weights/adfqs', 'SD.2-1', '--revision=fp16', '--with_prior_preservation', '--prior_loss_weight=1.0', '--seed=1337', '--resolution=512', '--train_batch_size=1', '--train_text_encoder', '--mixed_precision=fp16', '--use_8bit_adam', '--gradient_accumulation_steps=1', '--learning_rate=1e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=1200', '--sample_batch_size=4', '--max_train_steps=4650', '--save_interval=10000', '--save_sample_prompt=photo of adfqs man', '--concepts_list=concepts_list.json']' returned non-zero exit status 2.
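
In the failing command, '--output_dir=/content/drive/MyDrive/stable_diffusion_weights/adfqs' is immediately followed by a stray 'SD.2-1' token, which is what argparse rejects. A likely cause (an assumption, since the launch cell isn't shown) is a space in the output-directory value that the shell splits into a second argument; a small illustration:

```python
# Illustrative only: a space in the (hypothetical) output-dir value splits the
# launch command into an extra token, which argparse then reports as
# "unrecognized arguments: SD.2-1".
import shlex

output_dir = "/content/drive/MyDrive/stable_diffusion_weights/adfqs SD.2-1"  # assumed value

print(shlex.split(f"--output_dir={output_dir}"))
# ['--output_dir=/content/drive/MyDrive/stable_diffusion_weights/adfqs', 'SD.2-1']  -> two arguments

print(shlex.split(f"--output_dir={shlex.quote(output_dir)}"))
# ['--output_dir=/content/drive/MyDrive/stable_diffusion_weights/adfqs SD.2-1']     -> one argument
```

Removing the space from the output directory name, or quoting the variable in the launch command, should get past the exit status 2.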

raymondgp commented 1 year ago

Hi, I'm able to train the 512 2-1-base model on Colab with prior preservation on, 63 concept images, and 1200 class images. The samples from the 1k-step checkpoint come out fantastic; the concept (the face) is uncannily similar. I stopped the finetune at 3k steps as it started to overfit.

Now, packaging to ckpt, bringing over the model .yaml from the original 2-1-base, and running it in Automatic1111 completely killed the finetune: no more resemblance to the concept.

Any ideas or suggestions?

githubUser01946 commented 1 year ago

Keep an eye on this as I would love to retrain my face with 2.1.

Will you be updating this to work with 2.1, @ShivamShrirao?

raymondgp commented 1 year ago

I kept experimenting. The pipeline/inference inside Shivam's Colab works perfectly.

I can't tell whether the ckpt conversion shreds the model, or whether it's the lack of a proper .yaml file defining the model for Automatic1111 that breaks it.

raymondgp commented 1 year ago

Update here: training a 2.1 model with DreamBooth and using it in the Automatic1111 webui works.

It seems the Colab conversion to ckpt produces a bad Stable Diffusion model, or I'm doing something wrong either at runtime or within my Google Drive.

Here is how I got a working model in Automatic1111:

  1. Download the trained Diffusers pipeline to my machine.
  2. Run this script to convert the Diffusers model to ckpt (I end up with a .ckpt file):

https://github.com/lawfordp2017/diffusers/blob/main/scripts/convert_diffusers_to_original_stable_diffusion.py

Usage: convert_diffusers_to_original_stable_diffusion.py --model_path d://mypath//in//windows --model_checkpoint d://mypath//to//mymodel.ckpt

  3. Create a YAML config file for inference based on this config file, and rename it to the same name you gave the ckpt model:

https://github.com/Stability-AI/stablediffusion/blob/main/configs/stable-diffusion/v2-inference.yaml

  4. Save both the model and the inference YAML file in your models folder (a small sketch of this step follows at the end of this comment).
  5. Load in Automatic1111 & profit.

I can load this successfully in Automatic1111 and it works beautifully!
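
For anyone following along, a minimal sketch of steps 3-4 above (the paths are assumptions, not taken from the thread): copy the downloaded v2-inference.yaml next to the converted .ckpt under the same base name so Automatic1111 associates the two.

```python
# Minimal sketch of steps 3-4, with assumed paths: place the v2 inference config
# next to the converted checkpoint under the same base name, so Automatic1111
# loads them together.
import shutil
from pathlib import Path

ckpt = Path("stable-diffusion-webui/models/Stable-diffusion/mymodel.ckpt")  # converted checkpoint (assumed)
config = Path("v2-inference.yaml")  # downloaded from the Stability-AI repo linked above

shutil.copy(config, ckpt.with_suffix(".yaml"))  # -> .../mymodel.yaml
```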

stormsson commented 1 year ago

> It should work fine, but on Colab use stabilityai/stable-diffusion-2-1-base, not just 2-1. The problem lies with the image resolution: you can't finetune a model at 768 resolution on Colab, as it is guaranteed to run out of memory, and if you train the 768 model at 512 resolution it gets mixed up. Your best bet is to use 2-1-base and also change the scheduler to one of the Eulers.

I should thank you: I didn't run out of memory, but DreamBooth still wasn't working; switching to the -base model actually produced something.