[Feature Request]: Support SD Video (SVD-xt)

Is there an existing issue for this?

[X] I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

I would like to request support for the new Stable Video Diffusion model (SVD-xt), available at https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt. It is an image-to-video model that generates a short clip from a single source image, and it represents a significant advance in generative video.

The feature extension should include the following capabilities:

Ability to specify the number of frames: Users should be able to choose the desired number of frames in the generated video, allowing them to control the length and detail of the output.
Adjustable resolution: Provide the option to select the resolution of the generated video, giving users more control over the quality and file size of the output.
Upload pane for source image: Incorporate an upload pane for a source image, which can be used as a reference for the video model. This will enable users to easily input their desired image and create a video based on it.
Integration with img2txt tab: Add a button that allows users to seamlessly pull images from the img2txt tab within the application, streamlining the process of creating a video from an existing image.
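
To make the options above concrete, here is a rough sketch of how the frame count, resolution, and source image could be passed to the model. It assumes the Hugging Face diffusers `StableVideoDiffusionPipeline` rather than the Stability-AI reference code, and `SvdRequest` and `generate` are hypothetical names that do not exist in the extension. The defaults (25 frames at 1024x576) follow the SVD-xt model card; the frame rate of 7 is just an illustrative choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SvdRequest:
    """Hypothetical container for the user-facing options requested above."""
    image_path: str        # source image from the upload pane
    num_frames: int = 25   # SVD-xt was trained to generate 25 frames
    width: int = 1024      # model card resolution is 576x1024 (height x width)
    height: int = 576
    fps: int = 7           # illustrative playback rate for the exported video

    def validate(self) -> None:
        if self.num_frames < 1:
            raise ValueError("num_frames must be at least 1")
        # The diffusers pipeline rejects sizes that are not divisible by 8.
        if self.width % 8 or self.height % 8:
            raise ValueError("width and height must be multiples of 8")


def generate(request: SvdRequest, out_path: str = "generated.mp4") -> str:
    """Run SVD-xt on the source image and write the result to out_path."""
    request.validate()
    # Imported lazily so the settings object is usable without the GPU stack.
    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import export_to_video, load_image

    pipe = StableVideoDiffusionPipeline.from_pretrained(
        "stabilityai/stable-video-diffusion-img2vid-xt",
        torch_dtype=torch.float16,
        variant="fp16",
    )
    pipe.enable_model_cpu_offload()  # trades speed for lower VRAM use

    # PIL's resize takes (width, height).
    image = load_image(request.image_path).resize((request.width, request.height))
    frames = pipe(
        image,
        num_frames=request.num_frames,
        width=request.width,
        height=request.height,
        decode_chunk_size=8,  # decode fewer frames at once to save memory
    ).frames[0]
    export_to_video(frames, out_path, fps=request.fps)
    return out_path
```

A call such as `generate(SvdRequest("input.png", num_frames=14))` would then cover the frame-count option, and the image path could come either from the upload pane or from an image sent over from another tab.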

For reference, the sample code can be found at this link: https://github.com/Stability-AI/generative-models. I have plenty of VRAM and would be happy to test.

Thank you for considering this feature request, and I look forward to the continued development and improvement of the sd-webui-text2video extension.

Proposed workflow

Press [txt2video]
Add Model Type [SVD-xt]
Inherit the same settings as the txt2vid sub tab.

Additional information

No response

kabachuha / sd-webui-text2video