huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

CogVideoX I2V: Missing guard-rail on num_frames #9467

Open tin2tin opened 1 month ago

tin2tin commented 1 month ago

Describe the bug

CogVideoX I2V: A warning is raised when negative_prompt is defined, but none is raised when num_frames is defined. The latter causes a faulty render, so a guard-rail warning should probably be added for an explicitly defined num_frames value.

The faulty video: https://github.com/user-attachments/assets/5d3b4dd9-c9d2-43bb-b14c-bdd34b490bab

Reproduction

Pass an explicit num_frames value in the inference code.

Logs

No response

System Info

Latest beta diff, Win 11

Who can help?

@DN6 @a-r-r-o-w

a-r-r-o-w commented 1 month ago

I'm having a bit of trouble understanding what's wrong here, sorry. Do you mean to say that there should be a num_frames == 49 restriction here?

tin2tin commented 1 month ago

Well, I had a lot of faulty renders until I realized that disabling num_frames made it work. The value might have been 48.

a-r-r-o-w commented 1 month ago

Yes, unfortunately the I2V model doesn't support anything other than 49 frames. I missed this initially, so I opened a patch PR to fix it. Thanks for reporting!
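
For illustration only, here is a minimal sketch of what such a guard-rail could look like. It is not the code from the patch PR: the helper name `check_num_frames` and the constant `SUPPORTED_NUM_FRAMES` are hypothetical and are not part of the diffusers API. The only fact taken from this thread is that the I2V model supports exactly 49 frames.

```python
# Hypothetical guard-rail sketch; not the actual diffusers implementation.
# Assumption from this thread: the I2V model only supports 49 frames.
SUPPORTED_NUM_FRAMES = 49


def check_num_frames(num_frames=None, supported=SUPPORTED_NUM_FRAMES):
    """Return a validated frame count, or raise if it is unsupported.

    If num_frames is None (the caller did not set it), fall back to the
    supported value instead of failing.
    """
    if num_frames is None:
        return supported
    if num_frames != supported:
        raise ValueError(
            f"num_frames={num_frames} is not supported by this model; "
            f"expected {supported}."
        )
    return num_frames
```

Calling `check_num_frames(48)` raises a `ValueError` up front instead of producing a faulty render, while `check_num_frames()` and `check_num_frames(49)` both return 49. Raising an error rather than only logging a warning is a design choice in this sketch; a warning followed by a fallback to the supported value would be an equally plausible alternative.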
