nateraw / stable-diffusion-videos

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
Apache License 2.0

Using our own finetuned SD model with your repo #141

Closed javismiles closed 1 year ago

javismiles commented 1 year ago

Hi friends, say that instead of loading the standard model:

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    revision="fp16",
).to("cuda")

I want to use a Dreambooth fine-tuned model (fine-tuned on pictures of a person, etc.). How do I go about loading my own .ckpt Stable Diffusion checkpoint rather than the Hugging Face one, while keeping the rest of stable_diffusion_videos.ipynb the same?

thank you very much

nateraw commented 1 year ago

In most cases, literally all you have to do is replace the model ID in the from_pretrained call. If the model is based on SD v2+, I think you also have to pass safety_checker=None and feature_extractor=None to from_pretrained, but other than that everything should work fine.
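
For example, a rough sketch (the model ID below is just a placeholder for your own fine-tuned model or local folder):

import torch
from stable_diffusion_videos import StableDiffusionWalkPipeline

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "your-username/your-dreambooth-model",  # placeholder: Hub ID or local folder of the fine-tuned model
    torch_dtype=torch.float16,
    safety_checker=None,       # only needed for models that ship without these components (e.g. SD v2+)
    feature_extractor=None,
).to("cuda")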

nateraw commented 1 year ago

I'm out rn but when I get back to a laptop I'll shoot over a code snippet

javismiles commented 1 year ago

Thank you very much. The model is 1.5, although I may do it later with 2.1 as well. But I still don't see how to replace the model ID: if my model is local, I will have to somehow refer to my local disk.

checking this page: https://huggingface.co/docs/diffusers/using-diffusers/loading

they seem to suggest the following:

a) to do this locally, clone the repo:

git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5

b) and then put my local checkpoint inside that folder

c) and then load it roughly like this:

repo_id = "./stable-diffusion-v1-5"
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id)

to load from the local folder. I guess I could just replace the .ckpt in that folder with mine, and maybe that will work?

Otherwise I await any other suggestions you have. Thank you again :)

javismiles commented 1 year ago

Maybe your code snippet will make it all much easier, and allow me to simply reference my local disk for the new checkpoint of the 1.5 model. Crossing fingers and looking forward to your code snippet :)

javismiles commented 1 year ago

I tried the approach above, but it didn't work:

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "/home/projects/ai/pr71/sd/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    revision="fp16",
).to("cuda")

It executes perfectly, but despite me replacing the checkpoint in the downloaded repo with my Dreambooth fine-tuned checkpoint, the code is still loading the standard 1.5, not mine. It must be detecting something, or maybe it's loading the model from somewhere else.

hopefully your new code will help :)

nateraw commented 1 year ago

Here's an example of a dreambooth fine-tuned SD v1.5.1 from lambda labs that generates avatar images.

To load it:

import torch
from stable_diffusion_videos import StableDiffusionWalkPipeline

# Dreambooth 1.5.1 finetuned by lambda labs
pipe = StableDiffusionWalkPipeline.from_pretrained(
    "lambdalabs/dreambooth-avatar",
    torch_dtype=torch.float16,
    feature_extractor=None,
    safety_checker=None
).to('cuda')

Then I created a quick 1-second video like this:

video_path = pipe.walk(
    ["Snoop Dogg, avatarart style", "Snoop Dogg, avatarart style"],  # prompts to interpolate between
    [42, 1234],  # seeds, one per prompt
    num_interpolation_steps=30,
    batch_size=4,  # I'm on premium colab runtime. reduce to 1 if you are not.
)

Try it yourself in Colab.


And here is the result:

https://user-images.githubusercontent.com/32437151/212573645-7c3efc1e-8a3c-4f94-bc28-0d9e07df6aa8.mp4

javismiles commented 1 year ago

That's awesome, Nathan, and my Dreambooth model also works perfectly for me. My problem is still loading it from my local disk rather than from the Hub like "lambdalabs/dreambooth-avatar". How can I load it from my local disk?

nateraw commented 1 year ago

Hmm, interesting. I don't think I've actually ever tried that... I would have expected it to work fine.

Is your directory structure the same as what you see on the Hugging Face Hub repos for these models?
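
Roughly, a diffusers-format SD repo looks like this (a top-level model_index.json plus one subfolder per component, each with its own config and weight files):

model_index.json
feature_extractor/
safety_checker/
scheduler/
text_encoder/
tokenizer/
unet/
  config.json
  diffusion_pytorch_model.bin
vae/
  config.json
  diffusion_pytorch_model.bin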

javismiles commented 1 year ago

Thank you, yes, it's the same. Basically I followed this advice: https://huggingface.co/docs/diffusers/using-diffusers/loading

I downloaded the entire Stable Diffusion 1.5 repo folder and then replaced the .ckpt with my Dreambooth fine-tuned one,

then I refer to that folder with a local path,

and it executes with no errors, but it continues to run the standard 1.5, not mine.

That's the only way I see; otherwise I guess I would have to put my fine-tuned model on the Hugging Face Hub, but that is really not practical for my experiments. There must be a way to load it from my local disk.

By the way, did you upsample that video result? It looks great. Is it by adding upsample=True to the walk call that the video comes out upsampled like that? That's awesome.

nateraw commented 1 year ago

huh, weird. When I trained mine, I didn't have to download anything. I just used the dreambooth script from diffusers and it created the nice folder structure for me. No manual copy pasting, etc.

I'm assuming you used some other implementation of Dreambooth? Probably the one that uses Lightning, which would create a .ckpt file.


Can you load the model with diffusers.StableDiffusionPipeline successfully? If not, I think it might be better to ask over in that repo as our friends there can probably better assist.
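
For example, a quick sanity check along these lines (reusing the local path from your earlier snippet; the prompt is just a placeholder):

import torch
from diffusers import StableDiffusionPipeline

# if plain diffusers also loads the folder but ignores your fine-tune,
# the issue is with the local model folder rather than with this repo
pipe = StableDiffusionPipeline.from_pretrained(
    "/home/projects/ai/pr71/sd/stable-diffusion-v1-5",  # your local folder
    torch_dtype=torch.float16,
).to("cuda")
image = pipe("a photo of the subject you fine-tuned on").images[0]  # placeholder prompt
image.save("check.png")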

javismiles commented 1 year ago

I have done Dreambooth locally on my Linux machine before, but I needed to do quick tests with a variety of models and other options, so I used this: https://openart.ai/, which indeed produces a .ckpt.

I'm loading the model like this:

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "/home/projects/ai/test71/sd/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    revision="fp16",
).to("cuda")

which runs with no errors,

and then I can produce the video, but it's not applying my fine-tuned model, just the standard 1.5.

javismiles commented 1 year ago

By the way, back to the Snoop Dogg example you shared, I'm looking at: https://huggingface.co/lambdalabs/dreambooth-avatar/tree/main

and I see the architecture and config files, but where is the checkpoint file? Is it private, or where is it in there? I cannot see it.

I'm trying to understand where in those repos the checkpoint file is pointed to or linked.

javismiles commented 1 year ago

At least I found some new info: it is never loading a .ckpt, it is loading .bin files, for example for the unet this one: diffusion_pytorch_model.bin. That's what it's loading, and these .bin files have the same size as the .ckpt.

I wonder if I need some kind of converter to convert my .ckpt to this .bin format, any ideas?

Because basically loading SD 1.5 from my local disk is working perfectly, and if I delete that .bin file then it fails with an error looking for that file, so I have verified that I can run SD 1.5 locally this way.

I just need to know how to replace this .bin file with my .ckpt.

nateraw commented 1 year ago

@apolinario do you know? I assume this is a very easy problem to solve.

@javismiles please let me know the repo you used to train w/ dreambooth 😄

javismiles commented 1 year ago

@nateraw I found the solution. For others who may be looking for the same, here it is:

a) download this script: https://raw.githubusercontent.com/huggingface/diffusers/main/scripts/convert_original_stable_diffusion_to_diffusers.py

b) put your Dreambooth fine-tuned model (.ckpt) in a new folder

c) then run the script (here renamed to conv.py) like this:

python conv.py --checkpoint_path ./dreamboothmodel.ckpt --dump_path .

The script will create a folder structure with everything necessary to run the model in diffusers.

d) then you can load that model with:

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "/local-path-to-the-folder-you-just-created",
    torch_dtype=torch.float16,
    revision="fp16",
).to("cuda")

and voila, it works :)

And Nathan, to train: for quick experiments I used https://openart.ai/, and locally I used a few of the repos recommended by some podcasters. Here is one that I used a lot: https://colab.research.google.com/drive/1-HIbslQd7Ei_mAt25ipqSUMvbe3POm98?usp=sharing which is based on: https://github.com/TheLastBen/fast-stable-diffusion

javismiles commented 1 year ago

And I just tested upsample in your code and it works great, yeah. Thank you very much for your great help and support, very grateful :)
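
Concretely, something like this (placeholder prompts; the only change from your earlier snippet is the upsample flag):

video_path = pipe.walk(
    ["prompt one", "prompt two"],  # placeholder prompts
    [42, 1234],                    # seeds, one per prompt
    num_interpolation_steps=30,
    upsample=True,  # upsamples the generated frames before the video is written
)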

javismiles commented 1 year ago

Curious about one thing: in the function that creates the video that we can then download,

mp4 = open(video_path, 'rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()

is there any way, an extra parameter etc., to specify the quality of the mp4, i.e. the bitrate? That would be awesome.

nateraw commented 1 year ago

Actually, if you dissect the make_video_pyav fn you can inject some ffmpeg commands to do this. I don't think I have it parametrized to do so, though.

So my suggestion for now would be to just run an ffmpeg command on the output file that does what you're looking for.
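
For example, something along these lines (a sketch; the 8M bitrate is just an illustration, adjust to taste):

import subprocess

# re-encode the generated mp4 at a higher video bitrate
subprocess.run(
    [
        "ffmpeg", "-y",
        "-i", video_path,   # the path returned by pipe.walk(...)
        "-c:v", "libx264",
        "-b:v", "8M",
        "walk_high_bitrate.mp4",
    ],
    check=True,
)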

nateraw commented 1 year ago

If that's all good with you, can we close this issue? If you have other questions, please feel free to open another one.

javismiles commented 1 year ago

Yes, great suggestion, and yes indeed, let's close the issue. Thank you again for your help :)