Closed: suzukimain closed this issue 1 year ago
Hi @suzukimain, I believe you have to set the `t0` and `t1` arguments in the pipeline based on the number of inference steps.
See:
https://huggingface.co/docs/diffusers/v0.21.0/en/api/pipelines/text_to_video_zero#diffusers.TextToVideoZeroPipeline.__call__.t1
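For context, `t0` and `t1` are the motion-timestep arguments of `TextToVideoZeroPipeline.__call__`, and both must lie inside the sampling schedule, i.e. below `num_inference_steps`. A minimal sketch of that constraint (the helper function is hypothetical; the defaults `t0=44`, `t1=47` are my reading of the linked docs, so treat them as an assumption):

```python
def validate_motion_timesteps(num_inference_steps, t0=44, t1=47):
    """Hypothetical helper mirroring the constraint the pipeline expects:
    0 <= t0 < t1 < num_inference_steps (defaults assumed from the docs)."""
    if not (0 <= t0 < t1 < num_inference_steps):
        raise ValueError(
            f"t0={t0} and t1={t1} must satisfy 0 <= t0 < t1 < "
            f"num_inference_steps={num_inference_steps}; "
            "pass smaller t0/t1 or increase num_inference_steps."
        )
    return True

validate_motion_timesteps(60)                 # fine with the assumed defaults
validate_motion_timesteps(30, t0=20, t1=25)   # fine: t0/t1 scaled to the step count
# validate_motion_timesteps(30)               # would raise: 47 >= 30
```

This matches the reported behavior: with 30 steps the assumed defaults fall outside the schedule, while 60 steps (or explicitly passing smaller `t0`/`t1`) does not.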
Hi @DN6, I adjusted it based on your advice and it works now. Thank you very much.
Describe the bug
If I set `num_inference_steps` to 30 in txt2video, an error occurs, but if I set it to 60, no error occurs.

Reproduction
import torch import imageio from diffusers import TextToVideoZeroPipeline import numpy as np
model_id = "runwayml/stable-diffusion-v1-5" pipe = TextToVideoZeroPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda") seed = 0 video_length = 8 chunk_size = 4 prompt = "A panda is playing guitar on times square"
Generate the video chunk-by-chunk
result = [] chunk_ids = np.arange(0, video_length, chunk_size - 1) generator = torch.Generator(device="cuda")
num_inference_steps=30
for i in range(len(chunk_ids)): print(f"Processing chunk {i + 1} / {len(chunk_ids)}") ch_start = chunk_ids[i] ch_end = video_length if i == len(chunk_ids) - 1 else chunk_ids[i + 1]
Attach the first frame for Cross Frame Attention
Concatenate chunks and save
result = np.concatenate(result) result = [(r * 255).astype("uint8") for r in result] imageio.mimsave("video2.mp4", result, fps=4)
Logs
System Info
`diffusers` version: 0.21.4

Who can help?
No response