ShivamShrirao / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
https://huggingface.co/docs/diffusers
Apache License 2.0

Using vae-ft-mse-840000-ema-pruned.ckpt as custom VAE? #195


jadechip commented 1 year ago

Hi there! Most custom models I've come across seem to recommend using vae-ft-mse-840000-ema-pruned.ckpt as the VAE. However, I can't seem to find a way to load .ckpt files using the --pretrained_vae_name_or_path arg. Dreambooth only seems to work with ".bin" files, or am I mistaken? Do I need to convert the .ckpt file to .bin? Thank you so much!

futurevessel commented 1 year ago

Pretty sure you need to convert it into 'diffusers' format, so assuming you have a terminal open in the diffusers/examples/dreambooth/ directory, you should be able to do:

python ../../scripts/convert_original_stable_diffusion_to_diffusers.py --checkpoint_path vae-ft-mse-840000-ema-pruned.ckpt --dump_path sd-v15-vae/

and then set --pretrained_vae_name_or_path=sd-v15-vae/

jadechip commented 1 year ago

@futurevessel thank you for your response. I'm trying to get the script to run, but I'm running into some errors:

!python ./convert_original_stable_diffusion_to_diffusers.py --checkpoint_path ./sd-vae-ft-mse-original/vae-ft-mse-840000-ema-pruned.ckpt --dump_path sd-v15-vae/

2023-01-29 14:22:40 (39.0 MB/s) - ‘v1-inference.yaml’ saved [1873/1873]
Traceback (most recent call last):
  File "./convert_original_stable_diffusion_to_diffusers.py", line 716, in <module>
    converted_unet_checkpoint = convert_ldm_unet_checkpoint(
  File "./convert_original_stable_diffusion_to_diffusers.py", line 326, in convert_ldm_unet_checkpoint
    new_checkpoint["time_embedding.linear_1.weight"] = unet_state_dict["time_embed.0.weight"]
KeyError: 'time_embed.0.weight' 

I'm guessing it's not finding the UNet keys because I am passing in a VAE-only checkpoint that doesn't contain the UNet weights?
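That guess can be checked by inspecting the key prefixes in the checkpoint's state dict. A minimal sketch, assuming the usual LDM naming convention (the prefixes below are assumptions about that convention, not something taken from this repo):

```python
# Hedged sketch: report which sub-models a checkpoint's state dict appears to
# contain, based on assumed LDM key prefixes. A VAE-only checkpoint typically
# has bare "encoder."/"decoder." keys and none of these prefixes, which would
# explain the KeyError on 'time_embed.0.weight'.
LDM_PREFIXES = {
    "unet": "model.diffusion_model.",
    "vae": "first_stage_model.",
    "text_encoder": "cond_stage_model.",
}

def detect_components(state_dict):
    """Return {component: bool} for each assumed prefix."""
    return {
        name: any(key.startswith(prefix) for key in state_dict)
        for name, prefix in LDM_PREFIXES.items()
    }
```

Loading the .ckpt with `torch.load(path, map_location="cpu")["state_dict"]` and passing the result in would show whether the UNet keys are actually absent.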

jadechip commented 1 year ago

I might be able to comment out the code not related to the VAE in the ./convert_original_stable_diffusion_to_diffusers.py script 🤔
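For reference, the VAE branch of that script is largely a key-renaming pass from the LDM layout to the diffusers layout. A toy illustration of the kind of substring substitution involved (the rename pairs below are illustrative assumptions, not the script's full mapping):

```python
# Illustrative only: a couple of substring renames of the sort the conversion
# script applies when mapping LDM VAE keys to diffusers VAE keys.
EXAMPLE_RENAMES = [
    ("nin_shortcut", "conv_shortcut"),
    ("mid.attn_1.", "mid_block.attentions.0."),
]

def rename_vae_key(key):
    for old, new in EXAMPLE_RENAMES:
        key = key.replace(old, new)
    return key
```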

futurevessel commented 1 year ago

Hmm, pretty sure this is how I did it, but I get the same error now, so perhaps something has changed or I misremembered.

Have you tried having it download the vae from HuggingFace instead, by setting:

--pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse"

jadechip commented 1 year ago

@futurevessel Yes, downloading the default VAE from stabilityai/sd-vae-ft-mse works, as that repo has the VAE in ".bin" format. However, in the stabilityai/sd-vae-ft-mse-original repo the VAE is stored as vae-ft-mse-840000-ema-pruned.ckpt, so it throws an error when I try to run training :/

jadechip commented 1 year ago

Here is the error: Entry Not Found for url: https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/diffusion_pytorch_model.bin.
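That 404 is consistent with `from_pretrained` expecting a diffusers-format repo or folder, while the -original repo only hosts the .ckpt. A hedged sketch of the minimal file check (the file names are what diffusers-format VAE folders conventionally contain, taken as an assumption here):

```python
import os

# Assumed minimal contents of a diffusers-format VAE folder; a repo that only
# hosts a .ckpt (like sd-vae-ft-mse-original) lacks these files, which is why
# from_pretrained reports Entry Not Found for diffusion_pytorch_model.bin.
REQUIRED_FILES = ("config.json", "diffusion_pytorch_model.bin")

def looks_like_diffusers_vae(path):
    return all(os.path.isfile(os.path.join(path, name)) for name in REQUIRED_FILES)
```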

futurevessel commented 1 year ago

Ok, but if you just leave --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse" as is, it should load the VAE from the place where the download ended up; on my Linux system that is .cache/huggingface/diffusers/models--stabilityai--sd-vae-ft-mse/
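In general, `from_pretrained` treats its argument as a local directory if one exists at that path, and otherwise as a Hub repo id resolved through the cache. A simplified sketch of that dispatch (a hypothetical helper, not the actual diffusers code):

```python
import os

def resolve_vae_source(name_or_path):
    """Hypothetical mirror of how from_pretrained interprets its argument."""
    if os.path.isdir(name_or_path):
        return ("local", name_or_path)  # e.g. a converted sd-v15-vae/ folder
    return ("hub", name_or_path)        # e.g. "stabilityai/sd-vae-ft-mse"
```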

I don't get an error when using --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse" when I train. On the other hand, does this even affect training at all? When I remove these VAE files and make it download them again, the download is only triggered when it starts generating preview samples, not when training starts, meaning the training done up until the first preview samples certainly doesn't make use of this VAE file.