kabachuha / sd-webui-text2video

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies

[Bug]: Python crash #141

Closed flowskygge closed 1 year ago

flowskygge commented 1 year ago

Is there an existing issue for this?

Are you using the latest version of the extension?

What happened?

I get this message:

LLVM ERROR: Failed to infer result type(s).
zsh: abort ./webui.sh
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Steps to reproduce the problem

  1. Go to the text2video extension for auto1111
  2. Press Generate
  3. Python crashes

What should have happened?

A video should have been generated.

WebUI and Deforum extension Commit IDs

webui commit id - txt2vid commit id -

What GPU were you using for launching?


On which platform are you launching the webui backend with the extension?

Local PC setup (Mac)


Console logs

text2video extension for auto1111 webui
Git commit: 4fea1ada (Sun Apr 23 10:39:51 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
device mps
Working in txt2vid mode
  0%|                                                                              | 0/1 [00:00<?, ?it/s]Making a video with the following parameters:
{'prompt': 'a dog on a skateboard', 'n_prompt': 'text, watermark, copyright, blurry, nsfw', 'steps': 30, 'frames': 24, 'seed': 134513184, 'scale': 17, 'width': 256, 'height': 256, 'eta': 0.0, 'cpu_vae': 'GPU (half precision)', 'device': device(type='mps'), 'skip_steps': 0, 'strength': 0}
loc("mps_add"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/ba9e12cd-4051-11ed-a503-b25c5e9b9057/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<77x1x1024xf16>' and 'tensor<*xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
zsh: abort      ./webui.sh
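
The MPS error in the log above names two tensor types, one f16 and one f32, that could not be broadcast together. As a minimal sketch for triaging logs like this one, the snippet below pulls the tensor types out of such a line using only the Python standard library. The function names `parse_tensor_types` and `describe_mismatch` are hypothetical helpers, not part of the extension or the webui, and `LOG_LINE` is just the tail of the error line from the log.

```python
import re

# Tail of the MPS error line from the console log above.
LOG_LINE = (
    "error: input types 'tensor<77x1x1024xf16>' and 'tensor<*xf32>' "
    "are not broadcast compatible"
)

# Matches MLIR-style tensor types such as tensor<77x1x1024xf16> or tensor<*xf32>.
# Group 1 is the shape part, group 2 is the element type (e.g. f16, f32).
TENSOR_RE = re.compile(r"tensor<([^>]*)x([a-z]+\d+)>")


def parse_tensor_types(line):
    """Return a list of (shape, dtype) pairs found in an error line (hypothetical helper)."""
    return TENSOR_RE.findall(line)


def describe_mismatch(line):
    """Summarize whether the tensors named in the line use different element types."""
    dtypes = {dtype for _, dtype in parse_tensor_types(line)}
    if len(dtypes) > 1:
        return "mixed precision: " + " vs ".join(sorted(dtypes))
    return "no dtype mismatch found"


if __name__ == "__main__":
    print(parse_tensor_types(LOG_LINE))
    print(describe_mismatch(LOG_LINE))
```

For this log the sketch reports a half-precision operand and a full-precision operand in the same `mps_add`. It only reads the message; it does not say which setting or module produced either tensor.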

Additional information


github-actions[bot] commented 1 year ago

This issue has been closed due to incorrect formatting. Please address the following mistakes and reopen the issue: