nateraw / stable-diffusion-videos

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
Apache License 2.0

Stable diffusion 2-1 #129

Closed · tralala87 closed 1 year ago

tralala87 commented 1 year ago

Is it possible to run this with stabilityai/stable-diffusion-2-1?

davidrs commented 1 year ago

+1 to requesting this guidance. Here are some more details on the current attempt and the resulting error. I've not yet dug into pipeline_utils to see how it extracts pipeline args and how they changed between 1.4 and 2.

import torch

from stable_diffusion_videos import StableDiffusionWalkPipeline, Interface
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

model_id = "stabilityai/stable-diffusion-2"

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    revision="fp16",
).to("cuda")

interface = Interface(pipeline)

Throws error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-3739427ac16b> in <module>
      1 model_id = "stabilityai/stable-diffusion-2"
      2 
----> 3 pipeline = StableDiffusionWalkPipeline.from_pretrained(
      4     model_id,
      5     torch_dtype=torch.float16,

1 frames
/usr/local/lib/python3.8/dist-packages/diffusers/pipeline_utils.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    672         elif len(missing_modules) > 0:
    673             passed_modules = set(list(init_kwargs.keys()) + list(passed_class_obj.keys())) - optional_kwargs
--> 674             raise ValueError(
    675                 f"Pipeline {pipeline_class} expected {expected_modules}, but only {passed_modules} were passed."
    676             )

ValueError: Pipeline <class 'stable_diffusion_videos.stable_diffusion_pipeline.StableDiffusionWalkPipeline'> expected {'vae', 'scheduler', 'unet', 'text_encoder', 'feature_extractor', 'tokenizer', 'safety_checker'}, but only {'vae', 'scheduler', 'unet', 'text_encoder', 'tokenizer'} were passed.
nateraw commented 1 year ago

taking a look

nateraw commented 1 year ago

Hmm, the first error goes away if I pass feature_extractor=CLIPFeatureExtractor() or feature_extractor=None, but now I'm getting black images. I think it's either due to the diffusers dep being out of date or the different scheduler they use... investigating.
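
For reference, a minimal sketch of that workaround (assuming the same stabilityai/stable-diffusion-2 model_id as in the snippet above; CLIPFeatureExtractor comes from transformers):

import torch

from transformers import CLIPFeatureExtractor
from stable_diffusion_videos import StableDiffusionWalkPipeline

# Explicitly supplying the modules the ValueError reports as missing
# satisfies from_pretrained's expected-modules check;
# feature_extractor=None works too, since the SD 2.x repos ship without one.
pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    feature_extractor=CLIPFeatureExtractor(),
    safety_checker=None,
    torch_dtype=torch.float16,
    revision="fp16",
).to("cuda")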

nateraw commented 1 year ago

btw, related issue here for 2.0: https://github.com/nateraw/stable-diffusion-videos/issues/124

nateraw commented 1 year ago

ah some insight here: https://huggingface.co/stabilityai/stable-diffusion-2-1/discussions/9#639b847134967bcf45576059

nateraw commented 1 year ago

Ok, so you can successfully initialize it like this if you have the latest diffusers version installed (pip install --upgrade diffusers):

import torch

from stable_diffusion_videos import StableDiffusionWalkPipeline
from diffusers import DPMSolverMultistepScheduler

model_id = "stabilityai/stable-diffusion-2-1"

pipe = StableDiffusionWalkPipeline.from_pretrained(
    model_id,
    feature_extractor=None,
    safety_checker=None,
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")
# Swap in the scheduler recommended on the SD 2.x model cards
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

and generate images like this:

prompt = "a cat"
image = pipe(
    prompt,
    num_inference_steps=5,
    generator=torch.Generator(device='cuda').manual_seed(42),
    height=768,
    width=768
).images[0]

But the pipeline's walk function will currently generate just black images due to its use of torch.autocast('cuda'), which needs to be removed. If you run the same code wrapped in autocast, like this, you'll see what I mean...

prompt = "a cat"

with torch.autocast('cuda'):
    image = pipe(
        prompt,
        num_inference_steps=5,
        generator=torch.Generator(device='cuda').manual_seed(42),
        height=768,
        width=768
    ).images[0]
image
nateraw commented 1 year ago

@davidrs if you're around, feel free to try out the pull request above. This should work in Colab:

! pip install git+https://github.com/nateraw/stable-diffusion-videos@remove-autocast

Init the pipe:

import torch

from stable_diffusion_videos import StableDiffusionWalkPipeline
from diffusers import DPMSolverMultistepScheduler

model_id = "stabilityai/stable-diffusion-2-1"

pipe = StableDiffusionWalkPipeline.from_pretrained(
    model_id,
    feature_extractor=None,
    safety_checker=None,
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

Make videos:

video_path = pipe.walk(
    ['a cat', 'a dog'],
    [12345, 54321],
)
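
To preview the result inline in Colab, one option (a sketch; it assumes video_path points to the rendered .mp4) is IPython's built-in Video display:

from IPython.display import Video

# Embed the finished video in the notebook output cell.
Video(video_path, embed=True)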

I did all the edits from the browser haha, so I'm not sure I want to merge and make a release till it's tested a little.

davidrs commented 1 year ago

Excellent @nateraw, thanks for the speedy reply. I have one animation running that I'll wait to finish, then I'll restart the kernel and try the new branch!

davidrs commented 1 year ago

UPDATE: it is working; must have been stale deps, or autocast. Sorry. Thanks!!!


For a faster testing loop:

video_path = pipe.walk(
    ['a cat', 'a dog'],
    [12345, 54321],
    num_inference_steps=20,
    num_interpolation_steps=2,
)
nateraw commented 1 year ago

Awesome. Merged and made a release. Thanks for checking it so quickly!!