Why does the image become increasingly blurry from the second

jy0205 / Pyramid-Flow

Code of Pyramidal Flow Matching for Efficient Video Generative Modeling

https://pyramid-flow.github.io/

MIT License


Why does the image become increasingly blurry from the second #178

Open gameandwhk111111111111111111 opened 3 days ago

gameandwhk111111111111111111 commented 3 days ago

May I ask why, when I generate a 5-second video, the image becomes increasingly blurry and the picture quality drops significantly from the 2nd second onwards? Is this a limitation of the model itself? Also, objects with large motion amplitudes deteriorate almost instantly. Model: FLUX768P

jy0205 commented 3 days ago

Hi, thanks for your attention! We suspect that the miniflux 768p model has not been trained sufficiently. Apologies for that. Our resources have recently been occupied by other, higher-priority work, so the 768p model was only fine-tuned for a few steps from the 384p version. Once the work in hand is completed, we will continue to optimize the miniflux 768p version. By the way, does Miniflux 384p have a similar issue?

jy0205 commented 3 days ago

What generation task did you try? Text-to-video or Image-to-Video?

yjhong89 commented 2 days ago

I experienced a similar result (#177).

The 768p model shows increasing blur and artifacts as the video progresses, while the 384p model gives good results.
Here are the I2V samples I tested with the same text prompt: "The portrait remains happy throughout the video clip. Starts with a wide smile, raising cheeks, tightening lids, and pulling up the upper lip. It then transitions into a jaw drop and slight dimpling, maintaining the lip raise."

<384p> https://github.com/user-attachments/assets/e4664979-99ca-40f9-85cd-62ca1ee1ada6
<768p> https://github.com/user-attachments/assets/0fdd1d70-e45c-4c4e-812d-3f002328eb93

gameandwhk111111111111111111 commented 2 days ago

First of all, thank you very much for such an excellent model. I encountered this in image-to-video generation. When running the 384P model, the video produced was excellent. I look forward to further improvements to the 768P model.