Closed javismiles closed 1 year ago
@javismiles, would you please share the errors you're receiving, your environment, and which of the methods you're trying to use?
I am trying things manually with stabilityai/stable-diffusion-2-1 and seeing no issues at all yet.
@Atomic-Germ thank you very much for your message, this is the gist of what I'm using:
pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    torch_dtype=torch.float16,
    revision="fp16",
).to("cuda")
and the error that appears immediately when running that is this:
ValueError: Pipeline <class 'stable_diffusion_videos.stable_diffusion_pipeline.StableDiffusionWalkPipeline'> expected {'feature_extractor', 'vae', 'scheduler', 'safety_checker', 'text_encoder', 'unet', 'tokenizer'}, but only {'text_encoder', 'unet', 'tokenizer', 'vae', 'scheduler'} were passed
what do you think? I'm not running it in Gradio, I'm running it manually in a Jupyter notebook to have full control, but the issue is super simple: I run the instruction above and get the error shown.
Set feature_extractor=None and safety_checker=None in the from_pretrained call.
That should solve your problem.
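The two sets in the error message can be compared directly to see which components were missing. A minimal sketch in plain Python, with the set contents copied from the ValueError above (the variable names are only for illustration):

```python
# Component names copied from the ValueError message.
expected = {"feature_extractor", "vae", "scheduler", "safety_checker",
            "text_encoder", "unet", "tokenizer"}
passed = {"text_encoder", "unet", "tokenizer", "vae", "scheduler"}

# Anything expected but not passed has to be supplied explicitly,
# here as None, in the from_pretrained call.
missing = sorted(expected - passed)
print(missing)  # ['feature_extractor', 'safety_checker']
```

These are exactly the two arguments set to None in the suggestion above.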
@nateraw yes it works, thank you very much!
@nateraw sorry I have a new issue, so thanks to your change now generating single images works perfect with 2.1, using:
image = pipeline(prompt, height=448, width=800, guidance_scale=7.5, num_inference_steps=50,generator=generator).images[0]
however when I now try to do this (which is what I need):
video_path = pipeline.walk(
    [
        "whatever",
        "whatever",
        "whatever",
    ],
    [1111, 57, 55],
    upsample=True,
    fps=5,
    num_interpolation_steps=90,
    height=448,
    width=800,
)
visualize_video_colab(video_path)
it generates everything black: all the images in the "dreams" folder come out black, and the video is black too.
Single images generated with just pipeline are perfect, but when I use pipeline.walk with multiple prompts for interpolation, every frame comes out black.
how can I fix it? thank you :)
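When prompt and seed lists are typed by hand, a missing comma between string literals is easy to overlook, because Python silently concatenates adjacent literals instead of raising an error. The sketch below shows the effect and a small pre-flight check. It assumes walk expects one seed per prompt, and check_walk_inputs is a hypothetical helper, not part of stable_diffusion_videos:

```python
def check_walk_inputs(prompts, seeds):
    """Hypothetical helper: raise ValueError unless there is one seed per prompt."""
    if len(prompts) != len(seeds):
        raise ValueError(f"{len(prompts)} prompts but {len(seeds)} seeds")


# Missing comma after the second literal: Python joins it with the third.
prompts = [
    "a cat",
    "a dog"
    "a bird",
]
seeds = [1111, 57, 55]

print(len(prompts))  # 2
print(prompts[1])    # a doga bird

try:
    check_walk_inputs(prompts, seeds)
except ValueError as err:
    print(err)       # 2 prompts but 3 seeds
```

This only guards against mismatched lists; it is unrelated to the black-frame problem described here.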
@nateraw
so this works perfect and generates a great image with SD 2.1:
prompt = "whatever"
generator = torch.Generator("cuda").manual_seed(1111)
image = pipeline(prompt, height=512, width=512, guidance_scale=7.5, num_inference_steps=50, generator=generator).images[0]
but the following generates an all black result with the very same prompt and same SD 2.1 pipeline, and there are no errors, it generates images but fully black:
video_path = pipeline.walk(
[
"whatever",
"whatever",
"whatever",
],
[1111, 343, 57], upsample=True,
fps=5,
num_interpolation_steps=5,
height=512,
width=512,
)
and what I ultimately need is to do the walk for interpolation, let me know how can I fix it thank you :)
@nateraw and this is how I initialized the pipeline:
from stable_diffusion_videos import StableDiffusionWalkPipeline, Interface
pipeline = StableDiffusionWalkPipeline.from_pretrained( "stabilityai/stable-diffusion-2-1", feature_extractor=None, safety_checker=None, torch_dtype=torch.float16, revision="fp16", ).to("cuda")
interface = Interface(pipeline)
in case it can help this is how the class gets defined in my code:
<bound method StableDiffusionWalkPipeline.walk of StableDiffusionWalkPipeline { "_class_name": "StableDiffusionWalkPipeline", "_diffusers_version": "0.11.1", "feature_extractor": [ null, null ], "safety_checker": [ null, null ], "scheduler": [ "diffusers", "DDIMScheduler" ], "text_encoder": [ "transformers", "CLIPTextModel" ], "tokenizer": [ "transformers", "CLIPTokenizer" ], "unet": [ "diffusers", "UNet2DConditionModel" ], "vae": [ "diffusers", "AutoencoderKL" ] }
Yea interesting. I actually ran into this last night as well. Don't believe it was happening before...not 100% sure of that though.
My guess is it has to do with the scheduler and perhaps how we're handling scheduling here. Will investigate.
In the meantime, maybe give stable diffusion 2.1 base a try.
btw as an aside, I'm trying to fix the init here so we don't have to do the feature_extractor=None business in from_pretrained. Made a separate issue for it at #165
@nateraw thank you for the reply, interesting, I cross fingers that hopefully you find the way to fix it :) when you say "maybe give stable diffusion 2.1 base a try", what do you mean? I already used SD 2.1 without problems, to produce static images, all good, the problem is when trying to produce with it the interpolated vids :)
I just fixed the from_pretrained issue on main branch with #165 . If you install the main branch of the repo, perhaps the following script will work for you:
pip install git+https://github.com/nateraw/stable-diffusion-videos
import torch
from stable_diffusion_videos import StableDiffusionWalkPipeline
from diffusers import DPMSolverMultistepScheduler
device = "mps" if torch.backends.mps.is_available() else "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if device == "cuda" else torch.float32
pipe = StableDiffusionWalkPipeline.from_pretrained(
"stabilityai/stable-diffusion-2-1",
torch_dtype=torch_dtype,
).to(device)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe.walk(
prompts=['a cat', 'a dog'],
seeds=[1234, 4321],
num_interpolation_steps=5,
num_inference_steps=50,
fps=5,
)
pls let me know :)
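The nested conditional in the script reads as a priority order: mps first, then cuda, then cpu, with float16 used only on cuda. A minimal sketch of the same logic in plain Python, so it can be checked without torch installed (pick_device and pick_dtype_name are hypothetical names, and the dtype is returned as a string rather than a torch dtype):

```python
def pick_device(mps_available: bool, cuda_available: bool) -> str:
    """Mirror the script's one-liner: prefer mps, then cuda, then cpu."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def pick_dtype_name(device: str) -> str:
    """Half precision only on cuda; full precision on mps and cpu."""
    return "float16" if device == "cuda" else "float32"


for flags in [(True, True), (False, True), (False, False)]:
    device = pick_device(*flags)
    print(flags, device, pick_dtype_name(device))
```

In the real script the two flags come from torch.backends.mps.is_available() and torch.cuda.is_available().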
I'll copy this into colab and try it when I get a few mins
Ok I just threw it in colab and it seemed to work.
Or gist link directly if you prefer to read it that way.
@nateraw yes indeed, it works :) now I tried it properly and it works indeed, thank you very much :)
@nateraw it works, but now it throws out a lot of warning messages like:
Forward upsample size to force interpolation output size. Forward upsample size to force interpolation output size.
is it possible to hide them in some way? thank you :)
Ah this may be because of #156 , I changed the logging verbosity to INFO by default in that file.
Should be able to suppress by changing the diffusers logging verbosity.
Throw this at the top of your script:
from diffusers.utils import logging
logging.set_verbosity_warning()
Refer to docs here for more info.
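To see why raising the verbosity threshold hides these messages, here is a small sketch using only Python's standard logging module, which the diffusers logging utilities are built on. It assumes the message is emitted at INFO level, as the explanation above suggests; the logger name and ListHandler class are made up for the demo:

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("demo_pipeline")  # hypothetical logger name
logger.propagate = False                     # keep the demo off the root logger
handler = ListHandler()
logger.addHandler(handler)

msg = "Forward upsample size to force interpolation output size."

logger.setLevel(logging.INFO)     # INFO threshold: the message gets through
logger.info(msg)

logger.setLevel(logging.WARNING)  # WARNING threshold: INFO is dropped
logger.info(msg)
logger.warning("something worth seeing")

print(handler.messages)
```

Only the first INFO call and the warning end up in the list, which is the same effect set_verbosity_warning() has on the repeated message.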
If this issue is resolved please close it 😄
Feel free to open more issues if you run into anything else or have more questions!! ❤️
@nateraw thank you for your fantastic help :)
If I use these instructions (the script @nateraw posted above), the pipeline walks properly but produces only "random" images (like the ones in a kaleidoscope). Does anybody know why?
Hi everybody, is there a way to make this great repo work with the latest Stable Diffusion 2 version? I tried replacing the model id with stabilityai/stable-diffusion-2, but then I get errors when using some of the library methods (which doesn't happen with the 1.4 or 1.5 models). Thank you for any tips.