DZY-irene opened this issue 2 weeks ago (status: Open)
I noticed that the currently open-sourced model outputs video at a resolution of 480x848, while the videos generated in the playground have a resolution of 960x1696. Does this mean the quality difference I've noticed comes from a different model, or is it simply because I didn't use a longer prompt or a negative prompt?
They are definitely using a different prompt from the one the user types (probably sending it to an LLM to enhance it) and then upscaling the generated video from 480p to the resolution they give you. Which model are you using: fp8 or fp16?
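For what it's worth, the two resolutions quoted above differ by exactly 2x on each axis with the same aspect ratio. That is consistent with a plain 2x upscale, though it does not prove one. A minimal sketch of the arithmetic (the helper name `scale_factors` is just illustrative, not part of the Mochi code):

```python
from fractions import Fraction

def scale_factors(src, dst):
    """Return exact per-axis scale factors going from src to dst."""
    return Fraction(dst[0], src[0]), Fraction(dst[1], src[1])

# Resolutions as quoted in this thread.
cli_res = (480, 848)          # open-source CLI output
playground_res = (960, 1696)  # playground output

f0, f1 = scale_factors(cli_res, playground_res)
print(f0, f1)  # per-axis scale factors

# The aspect ratio is preserved only if both reduced fractions match.
same_aspect = Fraction(*cli_res) == Fraction(*playground_res)
print(same_aspect)
```

Both factors come out to 2 and the aspect ratios match, so the numbers alone cannot distinguish a native higher-resolution model from a post-hoc upscaler.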
Thank you so much for your contributions to the Text2Video open-source community! I used the same short prompt with the Mochi model through both the CLI demo and the playground, but I noticed a slight difference in the video quality. Could you let me know if the playground includes an implicit prompt refiner or any negative prompts?
Here are my inputs and outputs:
prompt: a fantasy landscape
from cli:
https://github.com/user-attachments/assets/cc86e04a-42ff-48ab-bea6-8ee802cee6f9
from playground:
https://github.com/user-attachments/assets/52282e79-ed04-48cb-8b25-cd00069441ab
prompt: alley
from cli:
https://github.com/user-attachments/assets/9565bfe5-c891-439d-b291-2bd22617049d
from playground:
https://github.com/user-attachments/assets/24807861-8354-46c7-9804-8902891f1a52
I'm curious which methods you use to enhance the video quality. Looking forward to your response!