Literally all ya have to do is replace the model ID in the from_pretrained fn in most cases. If the model is from SD v2+ you have to pass safety_checker=None and feature_extractor=None in from_pretrained as well, I think, but other than that everything should work fine.
I'm out rn but when I get back to a laptop I'll shoot over a code snippet
thank you very much, the model is 1.5, although I may do it later on with 2.1 as well. But I still don't see how to replace the model ID; if my model is local I will have to somehow refer to my local disk
checking this page: https://huggingface.co/docs/diffusers/using-diffusers/loading
they seem to suggest the following: a) to do this locally:
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
b) and then to put my local checkpoint inside that folder
c) and then sort of do like this:
repo_id = "./stable-diffusion-v1-5"
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id)
to load from the local, and I guess I could just replace the ckpt of that folder with mine, and maybe that will work?
otherwise I await any other suggestions you have, thank you again :)
maybe your code snippet will make it all much easier, maybe it will allow me to simply reference my local disk for the new checkpoint of the 1.5 model. Crossing fingers and looking forward to your code snippet :)
I tried the way above but it didn't work:
pipeline = StableDiffusionWalkPipeline.from_pretrained( "/home/projects/ai/pr71/sd/stable-diffusion-v1-5", torch_dtype=torch.float16, revision="fp16", ).to("cuda")
it does execute perfectly, but despite me replacing the checkpoints in the downloaded repo with my dreambooth fine-tuned checkpoint, the code is still loading the standard 1.5, not mine. It must detect something, or maybe it's loading it from somewhere else
hopefully your new code will help :)
Here's an example of a dreambooth fine-tuned SD v1.5.1 from lambda labs that generates avatar images.
To load it:
import torch
from stable_diffusion_videos import StableDiffusionWalkPipeline
# Dreambooth 1.5.1 finetuned by lambda labs
pipe = StableDiffusionWalkPipeline.from_pretrained(
    "lambdalabs/dreambooth-avatar",
    torch_dtype=torch.float16,
    feature_extractor=None,
    safety_checker=None,
).to('cuda')
Then I created a quick 1 second video like this
video_path = pipe.walk(
    ["Snoop Dogg, avatarart style", "Snoop Dogg, avatarart style"],  # prompts to interpolate between
    [42, 1234],  # seeds, one per prompt
    num_interpolation_steps=30,
    batch_size=4,  # I'm on premium colab runtime. reduce to 1 if you are not.
)
Try it yourself in colab
And here is the result:
that's awesome Nathan, and my dreambooth model also works perfectly for me. My problem continues to be loading it from my local disk, not from the hub like "lambdalabs/dreambooth-avatar". How can I load it from my local disk?
hmm interesting. I don't think I've actually ever tried... I would have expected it to work fine.
Your directory structure is the same as what you see on the hugging face hub repos for these models?
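If helpful, here's a quick way to check that the local folder has the files from_pretrained will actually look for. Just a sketch; the expected layout below is assumed from the runwayml/stable-diffusion-v1-5 Hub repo, and the path is the local one you mentioned:
import os

# Layout assumed from the runwayml/stable-diffusion-v1-5 Hub repo
model_dir = "/home/projects/ai/pr71/sd/stable-diffusion-v1-5"  # your local path
expected = [
    "model_index.json",
    "unet/diffusion_pytorch_model.bin",
    "vae/diffusion_pytorch_model.bin",
    "text_encoder/pytorch_model.bin",
    "scheduler/scheduler_config.json",
]
for f in expected:
    print(f, "->", "found" if os.path.exists(os.path.join(model_dir, f)) else "MISSING")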
thank you, yes, same, basically I followed this advice: https://huggingface.co/docs/diffusers/using-diffusers/loading
I downloaded the entire stable diffusion 1.5 repo folder, and then replaced the ckpt with my dreambooth finetuned one
then I refer to that folder with local path
and it executes, no errors, but continues to run the standard 1.5, not mine
well, that's the only way I see; otherwise I guess I would have to somehow put my fine-tuned model on the Hugging Face Hub, but this is really not practical for my experiments. There must be a way to load it from my local disk
by the way, did you upsample that video result? It looks great. Is it by adding ", upsample=True)" to the pipeline call that the video comes out upsampled like that? That's awesome
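Something like this is what I mean (the upsample flag on walk() is just my assumption from your snippets above, so treat it as a sketch):
video_path = pipe.walk(
    ["Snoop Dogg, avatarart style", "Snoop Dogg, avatarart style"],
    [42, 1234],
    num_interpolation_steps=30,
    upsample=True,  # assumed flag: upscale the generated frames before writing the mp4
)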
huh, weird. When I trained mine, I didn't have to download anything. I just used the dreambooth script from diffusers and it created the nice folder structure for me. No manual copy pasting, etc.
I'm assuming you used some other implementation of dreambooth? Probably the one that uses lightning, which would create a ckpt file
Can you load the model with diffusers.StableDiffusionPipeline successfully? If not, I think it might be better to ask over in that repo as our friends there can probably better assist.
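For example, something like this (a sketch, using the same local path from above):
import torch
from diffusers import StableDiffusionPipeline

# Sanity check: can plain diffusers load the local folder at all?
pipe = StableDiffusionPipeline.from_pretrained(
    "/home/projects/ai/pr71/sd/stable-diffusion-v1-5",  # local path from above
    torch_dtype=torch.float16,
).to("cuda")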
I have done dreambooth locally on my Linux machine previously, but I needed to do quick tests with a variety of models and other options, so I used this: https://openart.ai/ , which indeed produces a .ckpt
I'm loading the model like this:
pipeline = StableDiffusionWalkPipeline.from_pretrained( "/home/projects/ai/test71/sd/stable-diffusion-v1-5", torch_dtype=torch.float16, revision="fp16", ).to("cuda")
which runs well with no errors
and then I can produce the video, but it's not applying my fine-tuned model, just the standard 1.5
by the way, back to the Snoop Dogg example you shared, I'm looking at: https://huggingface.co/lambdalabs/dreambooth-avatar/tree/main
and I see the code of the architecture, but where is the checkpoint file? Is it private, or where is it in there? I cannot see it,
I'm trying to understand where in those repos the checkpoint file is pointed to or linked,
at least I found some new info: it never loads a .ckpt, it loads .bin files. For example, for the unet it loads this one: diffusion_pytorch_model.bin. These .bin files have the same size as the .ckpt
I wonder if I need some kind of a converter to convert my .ckpt to this .bin format, any ideas?
because basically loading SD 1.5 from my local disk works perfectly, and if I delete that .bin file then it fails with an error looking for that file, so I have verified that I can run SD 1.5 locally this way,
I just need to know how to replace this .bin file with my .ckpt
@apolinario do you know? I assume this is a very easy problem to solve.
@javismiles please let me know the repo you used to train w/ dreambooth 😄
@nateraw I found the solution; for others who may be looking for the same thing, here it is:
a) download this script: https://raw.githubusercontent.com/huggingface/diffusers/main/scripts/convert_original_stable_diffusion_to_diffusers.py
b) put your dreambooth fine-tuned model in a new folder
c) then run the script (saved here as conv.py) like this: python conv.py --checkpoint_path ./dreamboothmodel.ckpt --dump_path . (the dot means the converted model is written into the current folder). The script will create a folder structure with everything necessary to run the model in diffusers
d) then you can load that model with:
pipeline = StableDiffusionWalkPipeline.from_pretrained( "/local-path-to-the-folder-you-just-created", torch_dtype=torch.float16, revision="fp16", ).to("cuda")
and voila, it works :)
And Nathan, to train: for quick experiments I used https://openart.ai/ , and locally I used a few of the repos recommended by some podcasters. Here is one that I used a lot: https://colab.research.google.com/drive/1-HIbslQd7Ei_mAt25ipqSUMvbe3POm98?usp=sharing which is based on: https://github.com/TheLastBen/fast-stable-diffusion
and I just tested upsample in your code and it works great, yeah. Thank you very much for your great help and support, very grateful :)
curious about one thing,
in the function that creates the video that we can then download
from base64 import b64encode

mp4 = open(video_path, 'rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
is there any way, an extra parameter etc., to specify the quality of the mp4, the bitrate? That would be awesome
Actually if you dissect the make_video_pyav fn you can inject some ffmpeg commands to do this. I don't think I have it parametrized to do so, though.
So my suggestion for now would be to just run some ffmpeg command on the output file that does what you're looking for
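For example, something like this after the fact (a sketch; assumes ffmpeg is installed, video_path is the file returned by pipe.walk, and the output file name is just an example):
import subprocess

# Re-encode the generated mp4 with a target video bitrate (e.g. 8 Mbit/s)
subprocess.run([
    "ffmpeg", "-y",
    "-i", video_path,   # file produced by pipe.walk(...)
    "-b:v", "8M",       # target video bitrate
    "-c:a", "copy",     # keep the audio track (if any) untouched
    "high_bitrate.mp4", # example output file name
], check=True)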
If that's all good with you, can we close this issue? If you have others, please feel free to open another one
yes, great suggestion, and yes indeed let's close the issue, thank you again for your help :)
hi friends, say that instead of loading the standard model:
pipeline = StableDiffusionWalkPipeline.from_pretrained( "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16, revision="fp16", ).to("cuda")
I want to use a dreambooth fine-tuned model, fine-tuned on the pics of a person etc. How do I go about loading my own .ckpt Stable Diffusion checkpoint rather than the Hugging Face one, keeping the rest of stable_diffusion_videos.ipynb the same?
thank you very much