xdit-project / xDiT

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
Apache License 2.0

Tensor size mismatch in CogVideoX transformer forward pass #260

Closed: chen-yy20 closed this issue 1 month ago

chen-yy20 commented 1 month ago

Hi, thanks for your great work! I encountered an error when running the CogVideoX model on a single A800 GPU. The error occurs in the transformer's forward pass, at the step that adds positional embeddings to the hidden states: the two tensors have mismatched sizes, which suggests a problem with the model's implementation or configuration.

Here is the output log:

Additional notes:

Thanks for your help!
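For readers debugging a similar failure, here is a minimal sketch of a shape check that could be run just before the addition of positional embeddings to hidden states. It only mirrors the standard trailing-axis broadcasting rule; the helper name `broadcast_mismatches` and the example shapes are hypothetical and are not taken from xDiT, CogVideoX, or the log in this issue.

```python
from itertools import zip_longest


def broadcast_mismatches(hidden_shape, pos_embed_shape):
    """List the axes on which two shapes cannot be broadcast together.

    Axes are compared from the trailing dimension backwards. Two sizes are
    compatible when they are equal, when either is 1, or when one shape has
    no such axis. Each entry is (negative_axis, hidden_size, pos_embed_size).
    """
    mismatches = []
    pairs = zip_longest(reversed(hidden_shape), reversed(pos_embed_shape), fillvalue=1)
    for offset, (hidden_dim, pos_dim) in enumerate(pairs, start=1):
        if hidden_dim != pos_dim and hidden_dim != 1 and pos_dim != 1:
            mismatches.append((-offset, hidden_dim, pos_dim))
    return mismatches


# Hypothetical (batch, sequence, channels) shapes, chosen only for illustration.
hidden_states_shape = (2, 120, 64)
pos_embedding_shape = (1, 100, 64)

# The sequence axis (-2) differs: 120 tokens versus 100 embedding positions.
print(broadcast_mismatches(hidden_states_shape, pos_embedding_shape))
```

If the reported axis is the sequence axis, one plausible cause is that the positional embedding was built for a different resolution or frame count than the input, but that is an assumption to verify against the actual traceback.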

feifeibear commented 1 month ago

We are still working on CogVideoX support. Please see our latest commit and issue #265.