Vchitect / Latte

Latte: Latent Diffusion Transformer for Video Generation.
Apache License 2.0

How can I turn on the autoregressive mode to generate >16 frame videos? #42

Closed: luweiblues closed this issue 2 months ago

luweiblues commented 7 months ago
> As the title.
>
> Does it mean the current t2v model is not trained on other frame lengths and cannot generalize to other frame lengths?

Hi, producing videos directly with more than 16 frames can lead to low-quality output. To generate videos longer than 16 frames, you might consider using the autoregressive mode for better results.

Originally posted by @maxin-cn in https://github.com/Vchitect/Latte/issues/25#issuecomment-1960887682

maxin-cn commented 7 months ago
Latte does not directly support an autoregressive generation mode. To generate videos longer than 16 frames, you could instead try the method presented in FreeNoise.
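For readers unfamiliar with FreeNoise, here is a rough sketch of its noise-rescheduling idea as I understand it: sample independent noise only for the first 16 frames, then fill each later block of frames with a locally shuffled copy of the noise from one window earlier. This is not code from Latte or from the FreeNoise repository. The function name, the `stride` default of 4, and the `seed` parameter are assumptions made for illustration, and whether this transfers well to Latte has not been verified in this thread.

```python
import random


def rescheduled_noise_indices(total_frames, window=16, stride=4, seed=0):
    """Map every output frame to one of `window` base noise frames.

    Hypothetical helper: it returns indices only, so it works with any
    tensor library. Frames 0..window-1 use their own noise; each later
    block of `stride` frames reuses, in shuffled order, the noise of the
    block located `window` frames earlier.
    """
    if not 0 < stride <= window:
        raise ValueError("stride must be in 1..window")
    rng = random.Random(seed)
    indices = list(range(min(window, total_frames)))
    for start in range(window, total_frames, stride):
        # Noise assignments of the block one window earlier.
        source = indices[start - window : start - window + stride]
        rng.shuffle(source)  # local shuffle inside the block
        # Truncate the last block if total_frames is not a multiple of stride.
        indices.extend(source[: total_frames - start])
    return indices


if __name__ == "__main__":
    print(rescheduled_noise_indices(32))
```

With a base noise tensor whose frame axis has length 16, the returned list could be used to gather a longer initial noise tensor along that axis (the exact axis depends on the pipeline, so treat that as an assumption too). Note that FreeNoise also changes how temporal attention is computed over sliding windows, which this sketch does not cover.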

github-actions[bot] commented 2 months ago

Hi There! 👋

This issue has been marked as stale due to inactivity for 60 days.

We would like to inquire if you still have the same problem or if it has been resolved.

If you need further assistance, please feel free to respond to this comment within the next 7 days. Otherwise, the issue will be automatically closed.

We appreciate your understanding and would like to express our gratitude for your contribution to Latte. Thank you for your support. 🙏