Any descriptions on the dataset for pre-training?

THUDM / CogVideo

Text-to-video generation. The repo for ICLR2023 paper "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"

Apache License 2.0

Any descriptions on the dataset for pre-training? #8

Open zhoudaquan opened 1 year ago

zhoudaquan commented 1 year ago

Hi authors,

Congratulations on your great work! I have read through the paper and found no description of the source of the dataset used for pre-training. Could you please share which dataset you used, or how you collected the data for pre-training?

Regards, DQ

Maxlinn commented 1 year ago

I am also interested in the dataset.