How to generate lower resolution videos with CogVideoX1.5-5B?

System Info

CUDA version: 12.4
Diffusers version: 0.32.0.dev0
Python version: 3.10
OS: Ubuntu

Information

[X] The official example scripts
[ ] My own modified scripts

Reproduction

Download the model and the newly released code.
Change "sampling_image_size" in sat/configs/inference.yaml to a lower resolution.
Run bash inference.sh.

Expected behavior

Greetings! The newly released CogVideoX1.5-5B performs impressively. However, when I generate videos at a lower resolution, the bottom of each generated video is filled with stripes. At much lower resolutions (e.g. 480p or less), the generated videos show only part of the whole scene. I have attached two videos: the first is 1280x720 and the second is 720x480. Could you explain what causes this? I look forward to your reply. Thank you very much.

https://github.com/user-attachments/assets/6ea70640-1cb3-4604-86a3-99284e781427

https://github.com/user-attachments/assets/d281c1a5-a4af-4747-b852-3449299ef44e
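For reference, here is a minimal sketch of how the two resolutions above compare and how a lower resolution with the same aspect ratio as 1280x720 could be chosen. The helper names `aspect_ratio` and `scaled_size` are hypothetical, and rounding to a multiple of 16 is only an assumption on my side, not a documented requirement of the model. This does not claim to explain the stripes; it only shows that 720x480 has a different aspect ratio (3:2) from 1280x720 (16:9).

```python
from fractions import Fraction


def aspect_ratio(width, height):
    """Return the exact width:height ratio as a reduced fraction."""
    return Fraction(width, height)


def scaled_size(width, height, target_height, multiple=16):
    """Pick a (width, height) near target_height that keeps the source
    aspect ratio as closely as possible.

    Both sides are rounded to the nearest `multiple`. The default of 16
    is an assumption for illustration, not a value taken from the repo.
    """
    ratio = Fraction(width, height)
    new_h = max(multiple, round(target_height / multiple) * multiple)
    new_w = max(multiple, round(new_h * ratio / multiple) * multiple)
    return new_w, new_h


if __name__ == "__main__":
    print(aspect_ratio(1280, 720))      # 16/9
    print(aspect_ratio(720, 480))       # 3/2
    print(scaled_size(1280, 720, 480))  # (848, 480)
```

With these assumptions, a 480-pixel-high frame that stays close to 16:9 would be 848x480 rather than 720x480.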
