THUDM / CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Apache License 2.0
8.12k stars 766 forks

Running i2v mode fails with RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 16 but got size 8 for tensor number 1 in the list. #407

Closed hhue closed 1 week ago

hhue commented 1 week ago

System Info

CUDA 12.2, Python 3.10, Linux (Red Hat 7.5). Relevant dependencies and versions: transformers 4.45.2, diffusers>=0.30.3, accelerate>=0.34.0, imageio-ffmpeg>=0.5.1

Information

Reproduction

Running python cli_demo.py --prompt "A girl driver a car." --image_or_video_path "/data/llm/model/video/tmp/fpso.png" --generate_type "i2v" fails with the following error: RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 16 but got size 8 for tensor number 1 in the list. How can I fix this? Thanks!

Expected behavior

I hope this can be resolved soon. I am also looking for a fix on my side. Thanks.

zRzRzRzRzRzRzR commented 1 week ago

Did you download the T2V model instead of the I2V model? Please check CogVideoX-5B-I2V
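
One way to confirm which checkpoint is on disk is to look at the transformer config before launching the demo. The sketch below is only an illustration: it assumes the standard diffusers folder layout (model_dir/transformer/config.json), and it assumes that the I2V transformer reports in_channels of 32 (noise latents plus image latents) while the T2V transformer reports 16. The helper names are hypothetical and are not part of the CogVideo repository. If those assumptions hold, a plausible reading of the error is that the i2v code path halves the T2V model's 16 channels to 8, which then fails to concatenate with the 16-channel image latents.

```python
import json
from pathlib import Path

# Assumed values; verify them against the config.json of your own checkpoints.
I2V_IN_CHANNELS = 32  # noise latents + image latents
T2V_IN_CHANNELS = 16  # noise latents only


def variant_from_config(config: dict) -> str:
    """Classify a transformer config as "i2v", "t2v", or "unknown"."""
    in_channels = config.get("in_channels")
    if in_channels == I2V_IN_CHANNELS:
        return "i2v"
    if in_channels == T2V_IN_CHANNELS:
        return "t2v"
    return "unknown"


def detect_variant(model_dir: str) -> str:
    """Read <model_dir>/transformer/config.json and classify the checkpoint."""
    config_path = Path(model_dir) / "transformer" / "config.json"
    with config_path.open(encoding="utf-8") as f:
        return variant_from_config(json.load(f))


def check_generate_type(variant: str, generate_type: str) -> None:
    """Raise a readable error when an i2v run uses a non-I2V checkpoint."""
    if generate_type == "i2v" and variant != "i2v":
        raise ValueError(
            f"Checkpoint looks like '{variant}', but generate_type is 'i2v'; "
            "use an I2V checkpoint such as CogVideoX-5b-I2V."
        )
```

Calling check_generate_type(detect_variant(model_dir), "i2v") before building the pipeline would turn the tensor-size error into a message that names the actual cause.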

hhue commented 1 week ago

Solved. I found the cause: the CogVideoX-5b-I2V model is required. Thanks.
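
For reference, a minimal sketch of running image-to-video with the I2V checkpoint through diffusers directly. It assumes the Hugging Face model ID THUDM/CogVideoX-5b-I2V, a diffusers version that ships CogVideoXImageToVideoPipeline (0.30.3 or later), and enough GPU memory once CPU offload is enabled. The function name, the default output filename, and the sampling parameters are illustrative choices, not values taken from this thread.

```python
MODEL_ID = "THUDM/CogVideoX-5b-I2V"  # assumed Hub ID of the I2V checkpoint


def generate_i2v(prompt: str, image_path: str, output_path: str = "output.mp4") -> str:
    """Generate a short video from one image and a prompt; return the output path."""
    # Heavy imports are deferred so this module can be imported without a GPU stack.
    import torch
    from diffusers import CogVideoXImageToVideoPipeline
    from diffusers.utils import export_to_video, load_image

    # Load the image-to-video pipeline; a T2V checkpoint here would reproduce the error.
    pipe = CogVideoXImageToVideoPipeline.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    pipe.enable_model_cpu_offload()  # trade speed for lower GPU memory use
    pipe.vae.enable_tiling()         # decode the video in tiles to save memory

    image = load_image(image_path)
    frames = pipe(
        prompt=prompt,
        image=image,
        num_inference_steps=50,  # illustrative sampling settings
        guidance_scale=6.0,
        num_frames=49,
    ).frames[0]

    export_to_video(frames, output_path, fps=8)
    return output_path
```

With the original inputs, the call would look like generate_i2v("A girl driver a car.", "/data/llm/model/video/tmp/fpso.png").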