rhymes-ai / Allegro

Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.
https://rhymes.ai/
Apache License 2.0
291 stars 14 forks

Any plans to release data preparation code? #2

Open foreverpiano opened 1 day ago

foreverpiano commented 1 day ago

This refers to Chapter 2 of the paper.

TikhonovDongqiudi commented 22 hours ago

Thanks for your interest. Because developers draw on different data sources, data formats and structures vary significantly. We do not provide code that is compatible with every data format, nor do we standardize on a single data interface, since that would force some developers to do extra work adapting their own data. As for the data processing methods, Section 2 of the tech report describes every step, including the specific method used for each one. Some steps are described in detail (e.g., brightness), while others rely on tools from the open-source community (such as PySceneDetect, the LAION Aesthetic Predictor, and DOVER), all of which have implementations available. The filtering parameters for each step are listed in Table 1 of the tech report. By following the relevant sections of the report and using these open-source tools, you can set up a data pipeline tailored to your own data format. We welcome further discussion.
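
For readers assembling such a pipeline, here is a minimal sketch of what a threshold-based filtering stage might look like. It is not the Allegro team's code. The names `mean_brightness`, `RangeFilter`, and `filter_clips` are hypothetical, the brightness formula (mean Rec. 601 luma) is an assumption rather than the report's definition, and the threshold values are placeholders, not the parameters from Table 1. Scores such as aesthetic or DOVER quality are assumed to be computed beforehand by the respective tools and stored per clip.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel in 0..255
Frame = Sequence[Pixel]        # a frame flattened to a sequence of pixels


def mean_brightness(frames: Iterable[Frame]) -> float:
    """Average Rec. 601 luma over every pixel of every frame (0..255 scale).

    Assumption: this is one common brightness definition, not necessarily
    the one used in the tech report.
    """
    total = 0.0
    count = 0
    for frame in frames:
        for r, g, b in frame:
            total += 0.299 * r + 0.587 * g + 0.114 * b
            count += 1
    if count == 0:
        raise ValueError("cannot compute brightness of an empty clip")
    return total / count


@dataclass(frozen=True)
class RangeFilter:
    """Keep a clip only if metrics[metric] lies within [low, high]."""
    metric: str
    low: Optional[float] = None    # None means no lower bound
    high: Optional[float] = None   # None means no upper bound

    def accepts(self, metrics: Dict[str, float]) -> bool:
        value = metrics[self.metric]
        if self.low is not None and value < self.low:
            return False
        if self.high is not None and value > self.high:
            return False
        return True


def filter_clips(clips: List[dict], filters: Sequence[RangeFilter]) -> List[dict]:
    """Return the clips whose metrics pass every filter, preserving order."""
    return [c for c in clips if all(f.accepts(c["metrics"]) for f in filters)]


# Placeholder thresholds for illustration only; substitute the values
# from Table 1 of the tech report for a real pipeline.
EXAMPLE_FILTERS = [
    RangeFilter("brightness", low=20.0, high=180.0),
    RangeFilter("aesthetic", low=4.5),
]

if __name__ == "__main__":
    clips = [
        {"path": "a.mp4", "metrics": {"brightness": 90.0, "aesthetic": 5.2}},
        {"path": "b.mp4", "metrics": {"brightness": 5.0, "aesthetic": 6.0}},
    ]
    for clip in filter_clips(clips, EXAMPLE_FILTERS):
        print(clip["path"])
```

Keeping each filter as a small declarative record means a new step (for example a motion or text-coverage score) only needs a new metric key and one more `RangeFilter`, whatever the upstream data format is.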