Open: Closertodeath opened this issue 1 week ago
Similarly, for I2V the first frame of each video is used as the image. The code extracts it automatically.
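To illustrate the idea, here is a minimal sketch of pairing a video with its first frame as the conditioning image. It is not the actual fine-tuning code: `make_i2v_sample` and the dictionary keys are hypothetical names, and the frames are assumed to be already decoded into a sequence by whatever video reader you use.

```python
from typing import Any, Dict, Sequence


def make_i2v_sample(frames: Sequence[Any], caption: str) -> Dict[str, Any]:
    """Build one I2V training sample from decoded frames (hypothetical helper).

    The conditioning image is simply the first frame of the video, so the
    dataset needs no separate image files.
    """
    if len(frames) == 0:
        raise ValueError("video has no frames")
    return {
        "image": frames[0],  # first frame doubles as the input image
        "video": frames,     # full clip is the training target
        "caption": caption,  # text prompt, same as for T2V
    }


# Usage with placeholder frame objects standing in for decoded images
sample = make_i2v_sample(["frame0", "frame1", "frame2"], "a cat walking")
```

Under this assumption, an I2V dataset has the same layout as a T2V one (videos plus captions).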
Does any resolution work, similar to how I2V currently works, or does it need to be a set resolution?
The resolution needs to be fixed. For example, CogVideoX1.0 uses 720 * 480. CogVideoX1.5 supports 768-1360 on the long edge and 768 on the short edge. However, there is currently no manpower available to write dedicated fine-tuning code, and CogVideoX-Factory is expected to remain the fine-tuning framework for the open-source models.
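The constraints above can be sketched as a quick pre-check on a dataset's video sizes. This is only an illustration: `is_supported_resolution` is a hypothetical helper, 720 * 480 is assumed to mean width * height, and portrait orientation is rejected by default because this thread does not confirm that it is supported.

```python
def is_supported_resolution(model: str, width: int, height: int,
                            allow_portrait: bool = False) -> bool:
    """Check a video size against the resolutions quoted in this thread."""
    if model == "CogVideoX1.0":
        # Single fixed size, assumed to be width x height
        return (width, height) == (720, 480)
    if model == "CogVideoX1.5":
        # Portrait support is unconfirmed, so it is opt-in here
        if height > width and not allow_portrait:
            return False
        long_edge, short_edge = max(width, height), min(width, height)
        return short_edge == 768 and 768 <= long_edge <= 1360
    raise ValueError(f"unknown model: {model}")


# Usage
print(is_supported_resolution("CogVideoX1.0", 720, 480))
print(is_supported_resolution("CogVideoX1.5", 1360, 768))
print(is_supported_resolution("CogVideoX1.5", 768, 1360))
```

Videos that fail the check would need to be resized or cropped before fine-tuning.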
Regarding CogVideoX1.5, it supports 768-1360 (long edge) and 768 short edge
Is vertical video (i.e. 768x1360) meant to be supported? It always becomes blurry when I try.
System Info
Linux, otherwise N/A
Information
Reproduction
The dataset instructions here only cover how to prepare a dataset for text-to-video. There is no information for I2V.
Expected behavior
Information on how to prepare a dataset for image to video.