[Bug]: Exception occurred: memory_efficient_attention() got an unexpected keyword argument 'scale'

johari3275 commented 1 year ago

Is there an existing issue for this?

[X] I have searched the existing issues and checked the recent builds/commits of both this extension and the webui

Are you using the latest version of the extension?

[X] I have the modelscope text2video extension updated to the latest version and I still have the issue.

What happened?

Trying to generate a video fails with the exception in the title.

Steps to reproduce the problem

Use the default parameters and a prompt such as "a cute cat".

Steps = 30, CFG = 7, Width = 256, Height = 256

What should have happened?

The video should be generated.

WebUI and Deforum extension Commit IDs

webui commit id - [22bcc7be]
txt2vid commit id - 9b79cb8d3ab44de883c5ffafd89dd708f251458a

What GPU were you using for launching?

GEFORCE RTX 3070 8GB VRAM

On which platform are you launching the webui backend with the extension?

Local PC setup (Windows)

Settings

default settings

Console logs

text2video — The model selected is:  ModelScope
 text2video extension for auto1111 webui
Git commit: 9b79cb8d (Sun Apr 16 12:17:20 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
device cuda
Working in txt2vid mode
  0%|          | 0/1 [00:00<?, ?it/s]
latents torch.Size([1, 4, 24, 32, 32]) tensor(0.0021, device='cuda:0') tensor(0.9995, device='cuda:0')
DDIM sampling:   0%|          | 0/31 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\StableDiffusion\automatic1111/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 24, in run
    vids_pack = process_modelscope(args_dict)
  File "D:\StableDiffusion\automatic1111/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 193, in process_modelscope
    samples, _ = pipe.infer(args.prompt, args.n_prompt, args.steps, args.frames, args.seed + batch if args.seed != -1 else -1, args.cfg_scale,
  File "D:\StableDiffusion\automatic1111/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 245, in infer
    x0 = self.diffusion.ddim_sample_loop(
  File "D:\StableDiffusion\automatic1111\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\StableDiffusion\automatic1111\extensions\sd-webui-text2video\scripts\modelscope\t2v_model.py", line 1470, in ddim_sample_loop
    xt = self.ddim_sample(xt, t, model, model_kwargs, clamp,
  File "D:\StableDiffusion\automatic1111\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\StableDiffusion\automatic1111\extensions\sd-webui-text2video\scripts\modelscope\t2v_model.py", line 1322, in ddim_sample
    _, _, _, x0 = self.p_mean_variance(xt, t, model, model_kwargs, clamp,
  File "D:\StableDiffusion\automatic1111\extensions\sd-webui-text2video\scripts\modelscope\t2v_model.py", line 1263, in p_mean_variance
    y_out = model(xt, self._scale_timesteps(t), **model_kwargs[0])
  File "D:\StableDiffusion\automatic1111\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\StableDiffusion\automatic1111\extensions\sd-webui-text2video\scripts\modelscope\t2v_model.py", line 367, in forward
    x = self._forward_single(block, x, e, context, time_rel_pos_bias,
  File "D:\StableDiffusion\automatic1111\extensions\sd-webui-text2video\scripts\modelscope\t2v_model.py", line 430, in _forward_single
    x = self._forward_single(block, x, e, context,
  File "D:\StableDiffusion\automatic1111\extensions\sd-webui-text2video\scripts\modelscope\t2v_model.py", line 414, in _forward_single
    x = module(x, context)
  File "D:\StableDiffusion\automatic1111\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\StableDiffusion\automatic1111\extensions\sd-webui-text2video\scripts\modelscope\t2v_model.py", line 666, in forward
    x = block(x)
  File "D:\StableDiffusion\automatic1111\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\StableDiffusion\automatic1111\extensions\sd-webui-text2video\scripts\modelscope\t2v_model.py", line 733, in forward
    x = self.attn1(
  File "D:\StableDiffusion\automatic1111\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\StableDiffusion\automatic1111\extensions\sd-webui-text2video\scripts\modelscope\t2v_model.py", line 496, in forward
    out = xformers.ops.memory_efficient_attention(
TypeError: memory_efficient_attention() got an unexpected keyword argument 'scale'
Exception occurred: memory_efficient_attention() got an unexpected keyword argument 'scale'

Additional information

No response

volotat commented 1 year ago

I just encountered the same error. I tried several things, but in the end deleting the 'venv' folder and restarting the web UI, so that it reinstalls its dependencies, solved it.
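
Before deleting anything, it may be worth confirming which package versions the venv actually contains. A minimal sketch, assuming the distribution names `torch` and `xformers` and that it is run with the webui's own venv Python; `installed_version` is a hypothetical helper, not part of the extension:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumed distribution names; run this with the venv's python.exe.
    for name in ("torch", "xformers"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

If xformers turns out to be an older build, the `scale` keyword in the traceback is probably newer than what is installed, which would be consistent with a dependency reinstall fixing the error.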

rezponze commented 1 year ago

I just updated to the latest version and I get the same error, on a 3060 with 12GB VRAM. I would rather not delete the venv and reinstall the dependencies, because I have torch 2 installed.
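
For setups where rebuilding the venv is not an option, one possible workaround is to stop passing `scale` when the installed function does not accept it. This is a minimal sketch, not the extension's actual fix. `accepts_kwarg` and `attention_with_scale` are hypothetical helpers, and `attn_fn` stands in for `xformers.ops.memory_efficient_attention`. The fallback assumes the attention function applies the default `head_dim ** -0.5` scaling internally, so a custom scale can be folded into the query instead:

```python
import inspect
import math


def accepts_kwarg(fn, name):
    """Return True if fn names `name` as a parameter or takes **kwargs."""
    try:
        params = inspect.signature(fn).parameters
    except (TypeError, ValueError):
        return False  # signature not introspectable; treat as unsupported
    return name in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )


def attention_with_scale(attn_fn, q, k, v, scale, head_dim, **kwargs):
    """Call attn_fn with a custom softmax scale, even if it lacks `scale`."""
    if accepts_kwarg(attn_fn, "scale"):
        return attn_fn(q, k, v, scale=scale, **kwargs)
    # Assumption: attn_fn divides the logits by sqrt(head_dim) on its own.
    # Pre-multiplying q by scale * sqrt(head_dim) cancels that default and
    # leaves an effective scale of exactly `scale`.
    q = q * (scale * math.sqrt(head_dim))
    return attn_fn(q, k, v, **kwargs)
```

The call at `t2v_model.py` line 496 would then go through the wrapper instead of passing `scale` directly. Upgrading xformers to a build that accepts `scale` is likely the simpler route when a matching build exists for the installed torch version.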
