THUDM / CogVideo

Text-to-video generation. The repo for ICLR2023 paper "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"

Apache License 2.0

super-resolution #17

Closed by B-Soul 1 year ago

B-Soul commented 1 year ago

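Hello authors, I would like to ask about the purpose of the super-resolution step and how it actually works. In the code I can see that it is part of the second stage, but I could not find a corresponding explanation in the paper.

Thanks.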

wenyihong commented 1 year ago

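Hello. The frames that CogVideo initially generates have a resolution of 160×160, and the super-resolution step upscales them to 480×480. Because CogVideo uses the same VQ-VAE decoder as CogView2, it directly reuses CogView2's super-resolution method. For details, see Ding, M., Zheng, W., Hong, W., & Tang, J. (2022). CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers. arXiv preprint arXiv:2204.14217.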
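
To make the resolution change concrete, here is a minimal sketch of the shape arithmetic only: going from 160×160 to 480×480 is a 3× upscale along each axis. The helper names `scale_factor` and `upsample_nearest` are hypothetical and are not part of the CogVideo or CogView2 code. The nearest-neighbor upscaling is just a stand-in to show the output size; it is not the learned super-resolution method described in the CogView2 paper.

```python
def scale_factor(src, dst):
    """Return the integer per-axis upscaling factor, or raise if it is not uniform."""
    (sh, sw), (dh, dw) = src, dst
    if dh % sh or dw % sw or dh // sh != dw // sw:
        raise ValueError("target size is not a uniform integer multiple of source size")
    return dh // sh


def upsample_nearest(frame, factor):
    """Naive nearest-neighbor upscaling of a 2D grid (a list of rows).

    Placeholder only: the real pipeline uses CogView2's learned method.
    """
    out = []
    for row in frame:
        # Repeat every pixel `factor` times horizontally...
        wide = [px for px in row for _ in range(factor)]
        # ...then repeat the widened row `factor` times vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


# Resolutions taken from the reply above.
factor = scale_factor((160, 160), (480, 480))

# A dummy single-channel 160x160 frame (values are arbitrary).
frame = [[(y * 160 + x) % 256 for x in range(160)] for y in range(160)]
big = upsample_nearest(frame, factor)

print(factor, len(big), len(big[0]))  # prints "3 480 480"
```

Unlike this placeholder, a learned super-resolution model is meant to add plausible detail rather than just replicate pixels, so the code above should be read as a size illustration, not as a description of the output quality.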