Is there an existing issue for this?
[X] I have searched the existing issues and checked the recent builds/commits
What happened?
Using txt2video produces an all-black video.
Steps to reproduce the problem
Go to the txt2video tab
Write a prompt and generate
No errors are reported, but the generated frames and the resulting video are black
What should have happened?
A video of a chip spinning, matching the prompt.
Version or Commit where the problem happens
1.4.0
What Python version are you running on ?
Python 3.10.x
What platforms do you use to access the UI ?
Windows
What device are you running WebUI on?
Nvidia GPUs (GTX 16 below)
Cross attention optimization
Doggettx
What browsers do you use to access the UI ?
Google Chrome
Command Line Arguments
None (the console log shows "Launching Web UI with arguments:" followed by an empty list).
List of extensions
None listed, although the console log shows at least ControlNet and the text2video extension installed.
Console logs
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: v1.4.0
Commit hash: 394ffa7b0a7fff3ec484bcd084e673a8b301ccc8
Installing requirements
Launching Web UI with arguments:
No module 'xformers'. Proceeding without it.
2023-07-12 21:20:14,731 - ControlNet - INFO - ControlNet v1.1.232
ControlNet preprocessor location: D:\stabledif\stable-diffusion-webui\extensions\sd-webui-controlnet\annotator\downloads
2023-07-12 21:20:14,868 - ControlNet - INFO - ControlNet v1.1.232
Loading weights [7eb674963a] from D:\stabledif\stable-diffusion-webui\models\Stable-diffusion\hassakuHentaiModel_v13.safetensors
*Deforum ControlNet support: enabled*
Creating model from config: D:\stabledif\stable-diffusion-webui\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Startup time: 10.3s (import torch: 2.8s, import gradio: 1.6s, import ldm: 0.7s, other imports: 1.2s, setup codeformer: 0.1s, load scripts: 2.0s, create ui: 1.4s, gradio launch: 0.4s).
preload_extensions_git_metadata for 10 extensions took 0.55s
DiffusionWrapper has 859.52 M params.
Applying attention optimization: Doggettx... done.
Textual inversion embeddings loaded(0):
Model loaded in 10.7s (load weights from disk: 1.6s, create model: 0.9s, apply weights to model: 5.6s, apply half(): 1.5s, move model to device: 1.0s, calculate empty prompt: 0.2s).
text2video — The model selected is: zeroscope_v2_576w (ModelScope-like)
text2video extension for auto1111 webui
Git commit: 3f4a109a
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
device cuda
Working in txt2vid mode
0%| | 0/1 [00:00<?, ?it/s]Making a video with the following parameters:
{'prompt': 'chip spinning', 'n_prompt': 'text, watermark, copyright, blurry, nsfw', 'steps': 5, 'frames': 8, 'seed': 2305832716, 'scale': 17, 'width': 256, 'height': 256, 'eta': 0.0, 'cpu_vae': 'GPU (half precision)', 'device': device(type='cuda'), 'skip_steps': 0, 'strength': 1, 'is_vid2vid': 0, 'sampler': 'DDIM_Gaussian'}
Sampling random noise.
Sampling using DDIM_Gaussian for 5 steps.: 100%|█████████████████████████████████████████| 5/5 [01:47<00:00, 21.41s/it]
STARTING VAE ON GPU. 8 CHUNKS TO PROCESS.: 100%|█████████████████████████████████████████| 5/5 [01:47<00:00, 18.39s/it]
VAE HALVED
DECODING FRAMES
VAE FINISHED
torch.Size([8, 3, 256, 256])
output/mp4s/20230712_212559624295.mp4
text2video finished, saving frames to D:\stabledif\stable-diffusion-webui\outputs/img2img-images\text2video\20230712212300
Got a request to stitch frames to video using FFmpeg.
Frames:
D:\stabledif\stable-diffusion-webui\outputs/img2img-images\text2video\20230712212300\%06d.png
To Video:
D:\stabledif\stable-diffusion-webui\outputs/img2img-images\text2video\20230712212300\vid.mp4
Stitching *video*...
Stitching *video*...
Video stitching done in 1.36 seconds!
t2v complete, result saved at D:\stabledif\stable-diffusion-webui\outputs/img2img-images\text2video\20230712212300
Loading weights [f2769b3f82] from D:\stabledif\stable-diffusion-webui\models\Stable-diffusion\after_sex.safetensors
Creating model from config: D:\stabledif\stable-diffusion-webui\configs\v1-inference.yaml
Additional information
I am using the default UI settings. I don't think the video settings matter: I tried both "detailed forest" and "chip spinning" at 20 frames, and both produced black output. Thanks for the help.
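For completeness, a minimal sketch (not part of the original run) that checks whether the saved PNG frames are uniformly black; it assumes Pillow and NumPy are installed, and the directory path is taken from the log above.

# Quick check: print the maximum pixel value of each saved frame.
# A truly black frame has a maximum channel value of 0; near-black
# half-precision artifacts show up as very small maxima.
from pathlib import Path

import numpy as np
from PIL import Image

# Output directory as reported in the console log (adjust as needed).
frames_dir = Path(r"D:\stabledif\stable-diffusion-webui\outputs\img2img-images\text2video\20230712212300")

for frame in sorted(frames_dir.glob("*.png")):
    pixels = np.asarray(Image.open(frame).convert("RGB"))
    print(frame.name, "max pixel value:", int(pixels.max()))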