TencentARC / MotionCtrl

Official Code for MotionCtrl [SIGGRAPH 2024]
https://wzhouxiff.github.io/projects/MotionCtrl/
Apache License 2.0

RuntimeError: The shape of the 2D attn_mask is torch.Size([77, 77]), but should be (1, 1). #31

Open aulaywang opened 3 months ago

aulaywang commented 3 months ago

When I use the command sh configs/inference/run.sh, this error occurs:

Traceback (most recent call last):
  File "/data/code/MotionCtrl/main/evaluation/motionctrl_inference.py", line 354, in <module>
    run_inference(args, gpu_num, rank)
  File "/data/code/MotionCtrl/main/evaluation/motionctrl_inference.py", line 261, in run_inference
    batch_samples = motionctrl_sample(
  File "/data/code/MotionCtrl/main/evaluation/motionctrl_inference.py", line 146, in motionctrl_sample
    cond = model.get_learned_conditioning(prompts)
  File "/data/code/MotionCtrl/main/evaluation/../../lvdm/models/ddpm3d.py", line 552, in get_learned_conditioning
    c = self.cond_stage_model.encode(c)
  File "/data/code/MotionCtrl/main/evaluation/../../lvdm/modules/encoders/condition2.py", line 238, in encode
    return self(text)
  File "/root/miniconda3/envs/motionctrl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/code/MotionCtrl/main/evaluation/../../lvdm/modules/encoders/condition2.py", line 215, in forward
    z = self.encode_with_transformer(tokens.to(self.device))
  File "/data/code/MotionCtrl/main/evaluation/../../lvdm/modules/encoders/condition2.py", line 222, in encode_with_transformer
    x = self.text_transformer_forward(x, attn_mask=self.model.attn_mask)
  File "/data/code/MotionCtrl/main/evaluation/../../lvdm/modules/encoders/condition2.py", line 234, in text_transformer_forward
    x = r(x, attn_mask=attn_mask)
  File "/root/miniconda3/envs/motionctrl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/motionctrl/lib/python3.10/site-packages/open_clip/transformer.py", line 263, in forward
    x = q_x + self.ls_1(self.attention(q_x=self.ln_1(q_x), k_x=k_x, v_x=v_x, attn_mask=attn_mask))
  File "/root/miniconda3/envs/motionctrl/lib/python3.10/site-packages/open_clip/transformer.py", line 250, in attention
    return self.attn(
  File "/root/miniconda3/envs/motionctrl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/motionctrl/lib/python3.10/site-packages/torch/nn/modules/activation.py", line 1167, in forward
    attn_output, attn_output_weights = F.multi_head_attention_forward(
  File "/root/miniconda3/envs/motionctrl/lib/python3.10/site-packages/torch/nn/functional.py", line 5069, in multi_head_attention_forward
    raise RuntimeError(f"The shape of the 2D attn_mask is {attn_mask.shape}, but should be {correct_2d_size}.")
RuntimeError: The shape of the 2D attn_mask is torch.Size([77, 77]), but should be (1, 1).
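
A note on reading the message: PyTorch checks a 2D attn_mask against the (target length, source length) of the tensors the attention layer actually receives. One plausible reading, not confirmed in this thread, is that the attention layer treats its input as batch-first while the caller passes a sequence-first tensor of shape (77, 1, width), so the sequence length is seen as 1. The helper below is a hypothetical pure-Python sketch of that shape arithmetic only, not PyTorch or open_clip code, and the width of 1024 is an assumed value.

```python
def expected_2d_mask_shape(query_shape, key_shape, batch_first):
    """Return the (target_len, source_len) shape a 2D attn_mask must have.

    Hypothetical helper: it mirrors only the shape bookkeeping, not attention.
    Shapes are (batch, seq, width) when batch_first, else (seq, batch, width).
    """
    seq_axis = 1 if batch_first else 0
    return (query_shape[seq_axis], key_shape[seq_axis])


# 77 CLIP tokens, one prompt, assumed embedding width of 1024.
seq_first = (77, 1, 1024)

# Layer and caller agree on sequence-first layout: the 77x77 mask fits.
print(expected_2d_mask_shape(seq_first, seq_first, batch_first=False))

# Layer assumes batch-first but receives the same tensor: only (1, 1) would pass.
print(expected_2d_mask_shape(seq_first, seq_first, batch_first=True))
```

If this reading is right, the mask itself is fine and the mismatch comes from the tensor layout expected by the installed open_clip version.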

aulaywang commented 3 months ago

I solved this problem by installing open-clip-torch 2.24.0: pip install open-clip-torch==2.24.0
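
To confirm that the environment really has the pinned version before rerunning inference, a small check like the following may help. It is a minimal sketch using only the standard library; the function name `open_clip_status` and the constant `PINNED` are made up for illustration, and 2.24.0 is simply the version reported to work in this thread.

```python
from importlib.metadata import PackageNotFoundError, version

PINNED = "2.24.0"  # version reported to work in this thread


def open_clip_status(installed=None, pinned=PINNED):
    """Describe how the installed open-clip-torch compares with the pin.

    `installed` can be passed explicitly; otherwise it is looked up from
    the current environment's package metadata.
    """
    if installed is None:
        try:
            installed = version("open-clip-torch")
        except PackageNotFoundError:
            return "missing"
    return "ok" if installed == pinned else "mismatch"


# Prints "ok", "mismatch", or "missing" for the current environment.
print(open_clip_status())
```

Run it inside the same conda environment as the inference script (motionctrl in the traceback above); a "mismatch" result means a different open-clip-torch version is installed there.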

ShiMinghao0208 commented 1 month ago

> I solved this problem by installing open-clip-torch 2.24.0: pip install open-clip-torch==2.24.0

I hit the same bug while running the SDXL source code and it was driving me mad. Thanks!!!