WuTao-CS opened this issue 1 year ago
mkdir checkpoints
cd checkpoints
wget https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/resolve/main/svd.safetensors
mkdir checkpoints
cd checkpoints
wget https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/resolve/main/svd.safetensors?download=true
Thank you, but this is the image-to-video model; I'm asking about the text-to-video model.
Text-to-video isn't out yet.
I think it's easy to combine diffusers and image-to-video to do this.
Yes, I think it's easy to create such a pipeline: use a text-to-image model to generate the first frame, then feed that frame to the image-to-video model.
Maybe that's how they do it in the demo video.
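Something along these lines with diffusers should work (a rough, untested sketch; the model IDs, prompt, and parameters below are my own assumptions, not anything shipped in this repo):

import torch
from diffusers import StableDiffusionXLPipeline, StableVideoDiffusionPipeline
from diffusers.utils import export_to_video

# 1) Text -> image: generate the conditioning frame with an off-the-shelf T2I model
t2i = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
first_frame = t2i(prompt="a rocket launching at sunset, cinematic").images[0]
first_frame = first_frame.resize((1024, 576))  # SVD expects 1024x576 conditioning frames

# 2) Image -> video: animate that frame with SVD
i2v = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid", torch_dtype=torch.float16
).to("cuda")
frames = i2v(first_frame, decode_chunk_size=8).frames[0]
export_to_video(frames, "rocket.mp4", fps=7)

Note that with this setup the text only controls the first frame; the image-to-video model itself isn't conditioned on the prompt, so you don't get direct text control over the motion.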
So has anyone managed to run it? Even image to video?
The paper they released doesn't indicate that there will be a text-to-video model. It seems the intention is to combine image-to-video models with traditional text-to-image models to generate the initial frame.
From the paper:
Finally, many recent works tackle the task of image-to-video synthesis, where the start frame is already given and the model has to generate the consecutive frames [30, 93, 108]. Importantly, as shown in our work (see Figure 1) when combined with off-the-shelf text-to-image models, image-to-video models can be used to obtain a full text-(to-image)-to-video pipeline.
So has anyone managed to run it? Even image to video?
Yup... it works. After you install the package and prepare the env following the instructions, you need to download the model as mentioned by @crapthings:
mkdir checkpoints
cd checkpoints
wget https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/resolve/main/svd.safetensors
Then run the Streamlit demo, changing <your_port> as needed:
streamlit run scripts/demo/sampling.py --server.port <your_port>
If you are running this on a remote machine, make sure to tunnel the port.
Then navigate your browser to:
localhost:<your_port>
Example:
run:
streamlit run scripts/demo/sampling.py --server.port 8888
Navigate to:
localhost:8888/
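For the tunnelling step above (my usual approach, not something documented in this repo): forward the port over SSH from your local machine, e.g.
ssh -L 8888:localhost:8888 <user>@<remote_host>
then open localhost:8888 in your local browser as usual.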
@CyberTimon but how would you control what happens in the video?
Hey @mayank64ce, I'm sorry but I can't tell you that. I'm not that experienced with stable video etc.
From my side, the technical paper is quite misleading regarding the text-to-video part. By default, one assumes the code is aligned with what is claimed, but unfortunately that's currently not the case.
Exciting work! May I ask where the text-to-video model mentioned and used in the paper can be obtained? I only saw the waitlist for access to an upcoming web interface. Is there any open-source plan?