THUDM / CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Apache License 2.0

How to generate lower resolution videos with CogVideoX1.5-5B? #499

Open hitachinsk opened 1 week ago

hitachinsk commented 1 week ago

System Info

  - CUDA version: 12.4
  - Diffusers version: 0.32.0.dev0
  - Python version: 3.10
  - OS: Ubuntu

Information

Reproduction

  1. Download the model and the newly released code.
  2. Change "sampling_image_size" in sat/configs/inference.yaml to a lower resolution.
  3. Run bash inference.sh.
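
For anyone repeating step 2 across several resolutions, here is a minimal sketch of patching the setting in the config text from a script. Only the key name "sampling_image_size" comes from this issue. The helper `set_sampling_image_size` is hypothetical, and the single-line `[height, width]` layout it writes is an assumption about the file format, so check it against your copy of sat/configs/inference.yaml before relying on it.

```python
import re

# Matches one line of the form "<indent>sampling_image_size: <anything>".
# The single-line layout is an assumption about the config file.
_SIZE_LINE = re.compile(
    r"^([ \t]*sampling_image_size[ \t]*:[ \t]*).*$", re.MULTILINE
)


def set_sampling_image_size(yaml_text: str, height: int, width: int) -> str:
    """Return yaml_text with the sampling_image_size value replaced.

    Hypothetical helper: the [height, width] ordering is assumed,
    not confirmed by the issue.
    """
    new_text, count = _SIZE_LINE.subn(
        lambda m: f"{m.group(1)}[{height}, {width}]", yaml_text
    )
    if count == 0:
        # Fail loudly instead of silently leaving the config unchanged.
        raise KeyError("sampling_image_size not found in config text")
    return new_text
```

Calling `set_sampling_image_size(text, 480, 720)` on the file contents and writing the result back would correspond to the 720x480 run described below, assuming the ordering above is right. A regular expression is used rather than a YAML parser so that comments and the rest of the file stay byte-for-byte unchanged.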

Expected behavior

Greetings! The newly released CogVideoX1.5-5B performs impressively. However, when I generate videos with this model at a lower resolution, the bottom of each generated video is filled with stripes. At much lower resolutions (e.g. 480p or less), the generated videos show only part of the whole scene. Two videos are attached below: the first is 1280x720 and the second is 720x480. Could you explain what causes this? I look forward to your reply. Thank you very much.

https://github.com/user-attachments/assets/6ea70640-1cb3-4604-86a3-99284e781427

https://github.com/user-attachments/assets/d281c1a5-a4af-4747-b852-3449299ef44e