ShivamShrirao / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
https://huggingface.co/docs/diffusers
Apache License 2.0

Stable Diffusion v1.5 #50

Open redwraith2 opened 1 year ago

redwraith2 commented 1 year ago

Since Stable Diffusion 1.5 is out, I was hoping to use it with the Imagic Colab. Could you upgrade diffusers to use the new model instead of the old one?

NeoAnthropocene commented 1 year ago

I was searching the docs for how to update the model version used for training. IMHO, this should be the most important request right now. The SD 1.5 version is considerably more accurate at following text prompts.

ShivamShrirao commented 1 year ago

Changing MODEL_NAME to runwayml/stable-diffusion-v1-5 should work.

InB4DevOps commented 1 year ago

I did a 20k-step training run last night and used this as the model name. Worked like a charm. BTW: I tried the --train_text_encoder parameter and it worked on my 12GB 3060. So maybe you don't need 13GB after all? 🤷‍♂️

NeoAnthropocene commented 1 year ago

Thanks! that worked :)

NeoAnthropocene commented 1 year ago

My poor 11GB 2080 Ti also handled --train_text_encoder with the help of --use_8bit_adam and --gradient_accumulation_steps=1.

But the thing is, adding --train_text_encoder makes the trained face overpowering. I mean you can never create an illustration with the trained model. It always creates a very accurate photo of the trained face. I dunno if that happened to you. I trained another model with the same class without --train_text_encoder, and it works as it should.

InB4DevOps commented 1 year ago

I had best results with a pretty low CFG scale (3.2). At 7 and higher the faces were "too much". To find the "best" cfg scale I made an X/Y plot with X-axis cfg scale "1-15 (+0.1)" and used "30" for steps on y-axis.

InB4DevOps commented 1 year ago

It always creates a very accurate photo of the trained face.

When this happens you need to either reduce the weight of your instance prompt, e.g. [sks woman:0.8] (fiddle with the number), or place it further back, behind other words in the prompt.

ShivamShrirao commented 1 year ago

@NeoAnthropocene how many steps and at what lr did you train it for? Seems yours got overfit.

NeoAnthropocene commented 1 year ago

Hey Shivam, thx for the reply! 2020 steps, and the lr scheduler was set to "constant". I added 36 images to the training set. The results are normal without --train_text_encoder. It just seems sus to me 🤔

For more detail, you can see the training script commands below.

export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="training_ozguraltay"
export CLASS_DIR="classes_ozguraltay"
export OUTPUT_DIR="model_ozguraltay-SD15"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --train_text_encoder \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a picture of ozguraltay man" \
  --class_prompt="a picture of man" \
  --resolution=512 \
  --train_batch_size=1 \
  --mixed_precision="fp16" \
  --gradient_accumulation_steps=1 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=50 \
  --max_train_steps=2020

ShivamShrirao commented 1 year ago

@NeoAnthropocene Training with the text encoder requires different settings. Your train steps and learning rate are way too high. Try --learning_rate=1e-6 and --max_train_steps=800.

NeoAnthropocene commented 1 year ago

Hmm good to know that. I'll give it a shot and let you know about the result.

NeoAnthropocene commented 1 year ago

@ShivamShrirao You can see my test results below. Frankly speaking, the over-saturated model (trained model 4) gives me the best version of myself. But as you can see, it puts my face everywhere in the images that I created.

@InB4DevOps Thanks for the advice, but decreasing the CFG scale is not helpful because the engine then renders unrelated images. Maybe you can use it for inpainting only, just to change faces in photos. But it is not useful for illustrations.

Really appreciate your kind help guys. If you know any other useful trick please let me know.

[Image: training-test]

ShivamShrirao commented 1 year ago

@NeoAnthropocene You can see that there are a lot of values to cover between 800 and 2020 steps and 1e-6 to 5e-6 lr. Your Goldilocks zone will lie somewhere in between. Experiment.
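
One way to organise that search is to enumerate a small grid between the two settings mentioned in this thread. This is only a minimal sketch: the helper names `linspace` and `sweep_grid` and the choice of three points per axis are assumptions for illustration and are not part of train_dreambooth.py.

```python
from itertools import product


def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop inclusive."""
    if num < 2:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def sweep_grid(steps_range=(800, 2020), lr_range=(1e-6, 5e-6), points=3):
    """Enumerate (max_train_steps, learning_rate) pairs to try."""
    steps = [round(s) for s in linspace(*steps_range, points)]
    lrs = linspace(*lr_range, points)
    return list(product(steps, lrs))


if __name__ == "__main__":
    for max_steps, lr in sweep_grid():
        # Print the two flags to swap into the accelerate launch command.
        print(f"--max_train_steps={max_steps} --learning_rate={lr:.1e}")
```

Each printed line is a candidate pair of flags for a separate training run. Comparing the resulting models on the same prompts and seed should make it easier to see where overfitting starts.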

NeoAnthropocene commented 1 year ago

@ShivamShrirao Yes, you are definitely right. It should be around 1500 I guess, we'll see. I hope my experiment can help other people too.

InB4DevOps commented 1 year ago

@NeoAnthropocene see these tips from @joepenna

https://github.com/JoePenna/Dreambooth-Stable-Diffusion#oh-no-youre-not-getting-good-generations

Razunter commented 1 year ago

Changing MODEL_NAME to runwayml/stable-diffusion-v1-5 should work.

I'm getting Not Found for url: https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/config.json at the end

InB4DevOps commented 1 year ago

@Razunter you need to visit https://huggingface.co/runwayml/stable-diffusion-v1-5 and accept the license.

Razunter commented 1 year ago

@InB4DevOps I did. I'm getting this at the end of processing, after downloading and the training steps. I'm on WSL.

JoshuaAFerguson commented 1 year ago

How do I fix this issue?

404 Client Error: Not Found for url: https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/diffusion_pytorch_model.bin

InB4DevOps commented 1 year ago

It seems @ShivamShrirao needs to sync this fork with the upstream for this to work properly. I'm also having these problems lately.

For now if you want to use 1.5 you need to do this: (should also work with 1.4)

If this still does not work, go to /home/username/.cache/huggingface and delete the diffusers folder, then re-run the convert script and start a new training run.

toddka commented 1 year ago

@InB4DevOps is that script supposed to output a config.json and diffusion_pytorch_model.bin to the MODEL_NAME root? I'm only getting the following directories:

scheduler  text_encoder  tokenizer  unet  vae

This ends up giving an error when training due to the missing .bin.

Omegastick commented 1 year ago

I'm also seeing this. I wonder if something broke today.

InB4DevOps commented 1 year ago

@toddka @Omegastick Please go to /home/username/.cache/huggingface and delete the diffusers folder. Then re-run the convert command above.
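
For anyone who prefers to script the cleanup, here is a hedged sketch of the same step in Python. The function name `clear_diffusers_cache` is made up for illustration, and the default path simply mirrors the location mentioned above; pass a different root if your cache lives elsewhere.

```python
import shutil
from pathlib import Path


def clear_diffusers_cache(cache_root=None):
    """Delete the `diffusers` folder under the Hugging Face cache, if present.

    Returns True if a folder was removed, False if there was nothing to delete.
    """
    # Default location assumed from the comment above (~/.cache/huggingface).
    root = Path(cache_root) if cache_root else Path.home() / ".cache" / "huggingface"
    target = root / "diffusers"
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

Calling `clear_diffusers_cache()` with no arguments targets the default location. It only touches the `diffusers` subfolder, so the accelerate config stays in place.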

petekay commented 1 year ago

@InB4DevOps:

I am still getting the error, so here is my shell script:

echo "########################################"
echo "delete cache"
cd ~/.cache/huggingface
rm -r diffusers
read -t 1 -p "Cache deleted"

echo ""
echo "########################################"
echo "if exists, delete the transformed model"
cd ~/github/models/stable-diff/v1-4/
rm -r transformed
mkdir transformed
read -t 1 -p "(Re)-created transformed model "

echo ""
echo "########################################"
echo "go to ShivamShrirao scripts folder" 
cd ~/github/diffusers/scripts 
git remote -v 
read -t 1 -p "Check folder and repo "

echo ""
echo "########################################"
echo "convert_original_stable_diffusion_to_diffusers" 
python ./convert_original_stable_diffusion_to_diffusers.py \
  --checkpoint_path ~/github/models/stable-diff/v1-4/sd-v1-4-full-ema.ckpt \
  --dump_path ~/github/models/stable-diff/v1-4/transformed \
  --scheduler_type lms
read -t 1 -p "Model transformed"

echo ""
echo "########################################"
echo "Run Dreambooth"
cd ~/github/diffusers/examples/dreambooth
cat my_train3.sh
./my_train3.sh

and this still produces the OSError:

The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_cpu_threads_per_process` was set to `8` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
Traceback (most recent call last):
  File "/home/MY_USERNAME/github/diffusers/examples/dreambooth/train_dreambooth.py", line 765, in <module>
    main()
  File "/home/MY_USERNAME/github/diffusers/examples/dreambooth/train_dreambooth.py", line 431, in main
    vae=AutoencoderKL.from_pretrained(args.pretrained_vae_name_or_path or args.pretrained_model_name_or_path),
  File "/home/MY_USERNAME/anaconda3/envs/diffusers2/lib/python3.9/site-packages/diffusers/modeling_utils.py", line 321, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named diffusion_pytorch_model.bin found in directory /home/MY_USERNAME/github/models/stable-diff/v1-4/transformed.
Traceback (most recent call last):
  File "/home/MY_USERNAME/anaconda3/envs/diffusers2/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/MY_USERNAME/anaconda3/envs/diffusers2/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/home/MY_USERNAME/anaconda3/envs/diffusers2/lib/python3.9/site-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/home/MY_USERNAME/anaconda3/envs/diffusers2/lib/python3.9/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher

InB4DevOps commented 1 year ago

Sorry man, I tried this only with the 1.5 file - please try that one.

You can try deleting everything in /home/username/.cache/huggingface/, but you will have to run accelerate config again.

petekay commented 1 year ago

Thanks. I really don't know what changed, but for me it is no longer reproducible:

echo "########################################"
echo "delete cache"
cd ~/.cache/huggingface
rm -r diffusers
rm -r accelerate
rm -r hub
ls -A
read -t 3 -p "Cache deleted"

accelerate config

huggingface-cli login

echo ""
echo "########################################"
echo "if exists, delete the transformed model"
cd ~/github/models/stable-diff/v1-5/
rm -r transformed
mkdir transformed
read -t 3 -p "(Re)-created transformed model "

echo ""
echo "########################################"
echo "go to ShivamShrirao scripts folder" 
cd ~/github/diffusers/scripts 
git remote -v 
read -t 3 -p "Check folder and repo "

echo ""
echo "########################################"
echo "convert_original_stable_diffusion_to_diffusers" 
python ./convert_original_stable_diffusion_to_diffusers.py \
  --checkpoint_path ~/github/models/stable-diff/v1-5/v1-5-pruned.ckpt \
  --dump_path ~/github/models/stable-diff/v1-5/transformed \
  --scheduler_type lms
#python ./convert_original_stable_diffusion_to_diffusers.py \
#  --checkpoint_path ~/github/models/stable-diff/v1-4/sd-v1-4-full-ema.ckpt \
#  --dump_path ~/github/models/stable-diff/v1-4/transformed \
#  --scheduler_type lms
read -t 1 -p "Model transformed"

echo ""
echo "########################################"
echo "Run Dreambooth"
cd ~/github/diffusers/examples/dreambooth
cat my_train3.sh
./my_train3.sh

which results in:

export PATH=/usr/local/cuda-11.7/bin:$PATH
#export MODEL_NAME="CompVis/stable-diffusion-v1-4"
#export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export MODEL_NAME="/home/MY_USERNAME/github/models/stable-diff/v1-5/transformed"
export INSTANCE_DIR="SPECMODEL/training"
export CLASS_DIR="SPECMODEL/classes"
export OUTPUT_DIR="SPECMODEL/model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="photo of smnb person" \
  --class_prompt="photo of person" \
  --seed=1337 \
  --resolution=512 \
  --train_batch_size=1 \
  --sample_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --gradient_checkpointing \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --use_8bit_adam \
  --num_class_images=50 \
  --max_train_steps=800 \
  --mixed_precision="fp16"

The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_cpu_threads_per_process` was set to `8` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
Moving 0 files to the new cache system
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/MY_USERNAME/github/diffusers/examples/dreambooth/train_dreambooth.py", line 765, in <module>
    main()
  File "/home/MY_USERNAME/github/diffusers/examples/dreambooth/train_dreambooth.py", line 431, in main
    vae=AutoencoderKL.from_pretrained(args.pretrained_vae_name_or_path or args.pretrained_model_name_or_path),
  File "/home/MY_USERNAME/github/diffusers/src/diffusers/modeling_utils.py", line 321, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named diffusion_pytorch_model.bin found in directory /home/MY_USERNAME/github/models/stable-diff/v1-5/transformed.

Can you post your pip freeze output so I can compare?

InB4DevOps commented 1 year ago

absl-py==1.2.0
accelerate==0.12.0
aiohttp==3.8.3
aiosignal==1.2.0
antlr4-python3-runtime==4.9.3
async-timeout==4.0.2
attrs==22.1.0
bitsandbytes==0.34.0
cachetools==5.2.0
certifi @ file:///opt/conda/conda-bld/certifi_1663615672595/work/certifi
charset-normalizer==2.1.1
cmake==3.24.1.1
diffusers @ file:///home/greg/github/diffusers
filelock==3.8.0
frozenlist==1.3.1
fsspec==2022.10.0
ftfy==6.1.1
google-auth==2.12.0
google-auth-oauthlib==0.4.6
grpcio==1.49.1
huggingface-hub==0.10.0
idna==3.4
importlib-metadata==5.0.0
Jinja2==3.1.2
Markdown==3.4.1
MarkupSafe==2.1.1
modelcards==0.1.6
multidict==6.0.2
mypy-extensions==0.4.3
ninja==1.10.2.4
numpy==1.23.3
oauthlib==3.2.1
omegaconf==2.2.3
packaging==21.3
Pillow==9.2.0
protobuf==3.19.6
psutil==5.9.2
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyDeprecate==0.3.2
pyparsing==3.0.9
pyre-extensions==0.0.23
pytorch-lightning==1.7.7
PyYAML==6.0
regex==2022.9.13
requests==2.28.1
requests-oauthlib==1.3.1
rsa==4.9
scipy==1.9.3
six==1.16.0
tensorboard==2.10.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tokenizers==0.12.1
torch==1.12.1+cu116
torchaudio==0.12.1+cu116
torchmetrics==0.10.0
torchvision==0.13.1+cu116
tqdm==4.64.1
transformers==4.22.2
triton==2.0.0.dev20221003
typing-inspect==0.8.0
typing_extensions==4.3.0
urllib3==1.26.12
wcwidth==0.2.5
Werkzeug==2.2.2
xformers @ git+https://github.com/facebookresearch/xformers@1d31a3ac3b11f40fde7f00aa64debb0fd4d6f376
yarl==1.8.1
zipp==3.8.1

InB4DevOps commented 1 year ago

Not sure if this would really make a difference but I have a / at the end of my MODEL_NAME path - maybe try that?

InB4DevOps commented 1 year ago

Btw, this is the content of my converted model folder. Please compare it with yours.

(base) username@powerpc:~/sd14/1.5dump$ tree
.
├── model_index.json
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
│   └── pytorch_model.bin
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── unet
│   ├── config.json
│   └── diffusion_pytorch_model.bin
└── vae
    ├── config.json
    └── diffusion_pytorch_model.bin

5 directories, 12 files
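
To make that comparison less error-prone, a small script can report which of the files from the listing above are absent in your own folder. This is a sketch only: `EXPECTED_FILES` is copied from the tree output, and the helper name `missing_files` is made up for illustration.

```python
from pathlib import Path

# Relative paths taken from the tree listing above.
EXPECTED_FILES = [
    "model_index.json",
    "scheduler/scheduler_config.json",
    "text_encoder/config.json",
    "text_encoder/pytorch_model.bin",
    "tokenizer/merges.txt",
    "tokenizer/special_tokens_map.json",
    "tokenizer/tokenizer_config.json",
    "tokenizer/vocab.json",
    "unet/config.json",
    "unet/diffusion_pytorch_model.bin",
    "vae/config.json",
    "vae/diffusion_pytorch_model.bin",
]


def missing_files(model_dir):
    """Return the expected relative paths that are absent from model_dir."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

An empty list from `missing_files("/path/to/your/converted/model")` means the folder has the same twelve files as the listing. Anything it returns is a file the convert script did not write.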

petekay commented 1 year ago

Not sure if this would really make a difference but I have a / at the end of my MODEL_NAME path - maybe try that?

I added this, and now it works:

--pretrained_vae_name_or_path=$VAE_DIR \

VAE_DIR = MODEL_DIR + "/VAE" (I just copied the model path and appended /VAE to it)
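
A likely explanation, going by the traceback earlier in the thread: the VAE is loaded from `args.pretrained_vae_name_or_path or args.pretrained_model_name_or_path`, so without the extra flag the loader looks for diffusion_pytorch_model.bin in the model root instead of the VAE subfolder. The sketch below only imitates that fallback with plain path checks and does not call diffusers; the helper names are hypothetical.

```python
import os


def vae_lookup_dir(pretrained_model_name_or_path, pretrained_vae_name_or_path=None):
    """Mirror the `a or b` fallback seen in the traceback."""
    return pretrained_vae_name_or_path or pretrained_model_name_or_path


def has_vae_weights(directory):
    """Check for the weight file named in the OSError message."""
    return os.path.isfile(os.path.join(directory, "diffusion_pytorch_model.bin"))
```

With a converted model at `model_dir`, `has_vae_weights(vae_lookup_dir(model_dir))` would be False, while passing the VAE subfolder as the second argument makes it True. Note that the tree listing posted earlier shows the subfolder as lowercase `vae`, which matters on case-sensitive filesystems.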

petekay commented 1 year ago

diffusers @ file:///home/greg/github/diffusers

Mine looks like this:

-e git+https://github.com/ShivamShrirao/diffusers.git@fc0de43c8c0f2e9729f4a03ed6f4cf95f88bed46#egg=diffusers

Your diffusers is also from ShivamShrirao (the fork) and not the original one, correct?

InB4DevOps commented 1 year ago

Yes, it's the fork.

Omegastick commented 1 year ago

I needed to upgrade my transformers package to 4.22.2. I'm doing a training run now, and it seems to be working.
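
If you want to check your own environment before starting a run, here is a rough sketch of a version comparison using only the standard library. The helper names are made up, and the parsing is deliberately simple: it compares only the leading numeric parts and ignores local suffixes such as `+cu116`.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '4.22.2' into (4, 22, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(package, minimum):
    """Return True or False, or None if the package is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return version_tuple(installed) >= version_tuple(minimum)
```

For example, `is_at_least("transformers", "4.22.2")` reports whether the installed transformers matches or exceeds the version mentioned above.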