Vchitect / Latte

Latte: Latent Diffusion Transformer for Video Generation.
Apache License 2.0

preprocess dataset & t2v training #10

Open SKBL5694 opened 5 months ago

SKBL5694 commented 5 months ago

I saw in the README that the FaceForensics dataset can be used to train two models, class-conditional and unconditional Latte. Do I need to do any additional preprocessing on the FaceForensics dataset? How should I organize the data in the FaceForensics dataset? In addition, how do I train the T2V model?

maxin-cn commented 5 months ago

Hi, please follow the pre-processing method provided in stylegan-v to process FaceForensics. The dataloader for FaceForensics is already provided in Latte. Set the data_path parameter in ffs_train.yaml to /path/to/datasets/preprocess_ffs/train/videos/, the folder that contains all of the preprocessed videos.
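Before launching training, it may help to confirm that data_path really points at a folder with one entry per video. The snippet below is only a sketch and not part of Latte: the helper name list_video_entries is hypothetical, and it assumes each direct child of the folder is one preprocessed video, whether stored as a file or as a directory of frames.

```python
import os
import tempfile


def list_video_entries(data_path):
    """Return the sorted names of the entries directly under data_path.

    Hypothetical helper, not part of Latte. Assumes every non-hidden
    child of data_path is one preprocessed video.
    """
    if not os.path.isdir(data_path):
        raise FileNotFoundError(f"data_path is not a directory: {data_path}")
    return sorted(
        name for name in os.listdir(data_path) if not name.startswith(".")
    )


# Small self-contained demo on a temporary folder that mimics the layout.
with tempfile.TemporaryDirectory() as root:
    videos_dir = os.path.join(root, "preprocess_ffs", "train", "videos")
    for name in ("000", "001", "002"):
        os.makedirs(os.path.join(videos_dir, name))
    entries = list_video_entries(videos_dir)
    print(f"found {len(entries)} video entries, first: {entries[0]}")
```

In practice you would call list_video_entries with the same value you put in data_path in ffs_train.yaml and check that the count matches the number of videos you expect.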

As for the training code for T2V, please refer to https://github.com/maxin-cn/Latte/issues/4#issuecomment-1882255657.

huangjch526 commented 3 months ago

How can I download the FFS dataset? The original download method is broken. Could somebody share a drive link?

huangjch526 commented 3 months ago

Please, guys, I'm begging you.

maxin-cn commented 2 weeks ago

https://huggingface.co/datasets/maxin-cn/FaceForensics
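One possible way to fetch this dataset programmatically is through the huggingface_hub package. This is a sketch, not an official instruction from the Latte authors: it assumes huggingface_hub is installed (pip install huggingface_hub), and the function name download_ffs and the target folder are hypothetical. The layout of the downloaded files is not described in this thread, so inspect the result before pointing data_path at it.

```python
# Repository id taken from the link above.
REPO_ID = "maxin-cn/FaceForensics"


def download_ffs(local_dir):
    """Download the dataset repository into local_dir and return its path.

    Hypothetical helper. The import is done lazily so that defining the
    function does not require huggingface_hub or network access.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the link points at a dataset repo, not a model
        local_dir=local_dir,
    )


# Example call (placeholder path, triggers a download):
# path = download_ffs("/path/to/datasets/FaceForensics")
```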