kabachuha closed this issue 1 year ago
model is also on huggingface: https://huggingface.co/damo-vilab/modelscope-damo-text-to-video-synthesis/tree/main
I know, I linked it in the issue description
Correction: linked the space, not the model itself :)
On it: https://github.com/huggingface/diffusers/pull/2738
Hope to have it by Wednesday/Thursday
Closing it now, as implemented in https://github.com/huggingface/diffusers/pull/2738
Model/Pipeline/Scheduler description
Hello!
There seems to be a new 1.7B-parameter diffusion-based model by ModelScope for text-to-video synthesis, as noted by AKHaliq: https://twitter.com/_akhaliq/status/1637321077553606657?s=20. Both the model implementation and the weights (downloaded with their pipeline) are openly accessible, and it is already possible to run the model via Hugging Face Spaces. However, it lacks many possible optimizations, especially a low-VRAM mode, as well as accessibility options, and I believe it would benefit greatly from the help of the Diffusers community.
Example: monkey playing on drums
https://user-images.githubusercontent.com/14872007/226178634-d97b9782-a8fd-4dd1-989f-2544992a96b3.mp4
At this time the model needs around 16 GB of VRAM, but since it is a combination of 4 GB, 6 GB, and 5 GB models, I believe that with half precision and a sequential pipeline it will eventually be possible to run it on modern consumer hardware.
The code is licensed under Apache-2.0, so there should be no problems with using it as a reference.
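As a rough back-of-the-envelope check of that estimate, the sketch below adds up the three component sizes quoted above and shows what half precision and a sequential pipeline might save. It rests on two assumptions that the post does not confirm: that the quoted sizes are fp32 checkpoints (so fp16 halves them), and that a sequential pipeline keeps only one component on the GPU at a time. The function name `estimate_weights_gb` is hypothetical, and activations and framework overhead are ignored, which probably accounts for the gap between the 15 GB sum and the roughly 16 GB observed.

```python
# Component checkpoint sizes in GB, as quoted in the issue text.
COMPONENT_SIZES_GB = [4.0, 6.0, 5.0]


def estimate_weights_gb(sizes_gb, half_precision=False, sequential=False):
    """Rough estimate of the VRAM needed for the weights alone.

    Assumptions (not confirmed by the source):
    - the quoted sizes are fp32 checkpoints, so fp16 halves them;
    - a sequential pipeline holds only one component on the GPU at a time.
    Activations and framework overhead are not counted.
    """
    scale = 0.5 if half_precision else 1.0
    scaled = [size * scale for size in sizes_gb]
    # Sequential: peak is the largest single component.
    # Otherwise: all components are resident at once.
    return max(scaled) if sequential else sum(scaled)


for half in (False, True):
    for seq in (False, True):
        estimate = estimate_weights_gb(COMPONENT_SIZES_GB, half, seq)
        print(f"fp16={half!s:5} sequential={seq!s:5} -> {estimate:.1f} GB")
```

Under these assumptions the weights alone would drop from 15 GB to about 3 GB with both optimizations combined, which is why consumer GPUs look like a plausible target.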
Open source status
Provide useful links for the implementation
Hugging Face Space:
https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis
All parts of the model on Hugging Face:
https://huggingface.co/damo-vilab/modelscope-damo-text-to-video-synthesis/tree/main
The model's PyTorch implementation:
https://github.com/modelscope/modelscope/tree/master/modelscope/models/multi_modal/video_synthesis
Google Colab from the devs:
https://colab.research.google.com/drive/1uW1ZqswkQ9Z9bp5Nbo5z59cAn7I0hE6R?usp=sharing
License: Apache-2.0