Vchitect / Latte

Latte: Latent Diffusion Transformer for Video Generation.
Apache License 2.0

Results & ckpts of different sized Latte on UCF-101 #101

Closed yizenghan closed 1 week ago

yizenghan commented 1 month ago

Hi there, could you report the FVD/IS results of different-sized Latte models on UCF-101?

If possible, the pre-trained checkpoints would be useful. Thanks!

maxin-cn commented 1 month ago

We did not train models of different sizes on UCF-101; we only trained them on FFS. You can find the pre-trained checkpoints here.

yizenghan commented 1 month ago

Hi, may I ask how many iterations the released checkpoints were trained for? The plots in the paper show 150k iterations of training, but in the code the maximum number of training steps is 1e6.

maxin-cn commented 1 month ago

I don't remember exactly how many iterations. But according to other people's replications, training on 8 A100 (80G) GPUs for about one week can reach the values reported in the paper.
