DiffSynth Studio is a diffusion engine. We have restructured the architectures, including the Text Encoder, UNet, and VAE, maintaining compatibility with models from the open-source community while improving computational performance. We provide many interesting features. Enjoy the magic of diffusion models!
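At its core, any diffusion engine's scheduler iterates the forward/reverse noising arithmetic. The toy sketch below (pure Python, illustrative only; these function names are not DiffSynth Studio's API) shows the two directions: adding noise as x_t = √ᾱ·x₀ + √(1−ᾱ)·ε, and recovering x₀ from a noise estimate.

```python
import math
import random

# Toy illustration of the scheduler arithmetic a diffusion engine iterates.
# Not DiffSynth Studio code; all names here are hypothetical.

def make_alpha_bars(steps=10, beta=0.05):
    """Cumulative product of (1 - beta_t) for a constant-beta schedule."""
    bars, prod = [], 1.0
    for _ in range(steps):
        prod *= (1.0 - beta)
        bars.append(prod)
    return bars

def add_noise(x0, eps, alpha_bar):
    """Forward process: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps."""
    return [math.sqrt(alpha_bar) * a + math.sqrt(1 - alpha_bar) * e
            for a, e in zip(x0, eps)]

def predict_x0(xt, eps, alpha_bar):
    """Reverse direction: given the noise estimate, recover x0."""
    return [(x - math.sqrt(1 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for x, e in zip(xt, eps)]

random.seed(0)
x0 = [random.uniform(-1, 1) for _ in range(4)]   # a tiny stand-in "latent"
eps = [random.gauss(0, 1) for _ in range(4)]     # the injected noise
a_bar = make_alpha_bars()[-1]
xt = add_noise(x0, eps, a_bar)
x0_hat = predict_x0(xt, eps, a_bar)              # matches x0 up to float error
```

In a real engine the noise estimate comes from the UNet conditioned on the Text Encoder's output; here it is given exactly, so the round trip recovers the original latent.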
To date, DiffSynth Studio supports the following models:
June 21, 2024. 🔥🔥🔥 We propose ExVideo, a post-tuning technique aimed at enhancing the capability of video generation models. We have extended Stable Video Diffusion to generate long videos of up to 128 frames.
examples/ExVideo
June 13, 2024. DiffSynth Studio has been transferred to ModelScope. The developers have transitioned from "I" to "we". Of course, I will still participate in development and maintenance.
Jan 29, 2024. We propose Diffutoon, a fantastic solution for toon shading.
Dec 8, 2023. We decide to develop a new project, aiming to unleash the potential of diffusion models, especially in video synthesis. The development of this project begins.
Nov 15, 2023. We propose FastBlend, a powerful video deflickering algorithm.
Oct 1, 2023. We release an early version of this project, namely FastSDXL, an attempt at building a diffusion engine.
Aug 29, 2023. We propose DiffSynth, a video synthesis framework.
git clone https://github.com/modelscope/DiffSynth-Studio.git
cd DiffSynth-Studio
pip install -e .
The Python examples are in examples. We provide an overview here.
We trained an extended video synthesis model that can generate 128 frames. examples/ExVideo
https://github.com/modelscope/DiffSynth-Studio/assets/35051019/d97f6aa9-8064-4b5b-9d49-ed6001bb9acc
Generate high-resolution images by breaking the resolution limits of diffusion models! examples/image_synthesis
512*512 | 1024*1024 | 2048*2048 | 4096*4096 |
---|---|---|---|
1024*1024 | 2048*2048 |
---|---|
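High-resolution synthesis like this is commonly implemented by processing the image in overlapping tiles. The sketch below is a toy illustration of that idea, not DiffSynth Studio's actual implementation: it computes overlapping tile coordinates so that tiles of a fixed size cover the whole canvas.

```python
# Toy sketch of tiled processing for high-resolution generation.
# Hypothetical helper names; not DiffSynth Studio's API.

def tile_grid(size, tile, overlap):
    """1-D start positions so tiles of `tile` pixels, overlapping by
    `overlap` pixels, cover all `size` pixels (assumes size >= tile)."""
    stride = tile - overlap
    starts = list(range(0, size - tile + 1, stride))
    if starts[-1] + tile < size:
        starts.append(size - tile)  # final tile flush with the edge
    return starts

def tiles(h, w, tile=64, overlap=16):
    """All (y0, y1, x0, x1) tile boxes covering an h-by-w canvas."""
    return [(y, y + tile, x, x + tile)
            for y in tile_grid(h, tile, overlap)
            for x in tile_grid(w, tile, overlap)]
```

Each tile would be denoised independently and the overlapping regions blended (e.g. by feathered averaging) to hide seams; that blending step is omitted here for brevity.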
Render realistic videos in a flat, toon-shaded style and enable video editing features. examples/Diffutoon
https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/b54c05c5-d747-4709-be5e-b39af82404dd
https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/20528af5-5100-474a-8cdc-440b9efdd86c
Video stylization without video models. examples/diffsynth
https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/59fb2f7b-8de0-4481-b79f-0c3a7361a1ea
Use Hunyuan-DiT to generate images with Chinese prompts. We also support LoRA fine-tuning of this model. examples/hunyuan_dit
Prompt: 少女手捧鲜花，坐在公园的长椅上，夕阳的余晖洒在少女的脸庞，整个画面充满诗意的美感 (A young girl holding a bouquet of flowers sits on a park bench; the glow of the setting sun falls on her face, and the whole scene is full of poetic beauty)
1024x1024 | 2048x2048 (highres-fix) |
---|---|
Prompt: 一只小狗蹦蹦跳跳，周围是姹紫嫣红的鲜花，远处是山脉 (A little dog bounding about, surrounded by colorful blooming flowers, with mountains in the distance)
Without LoRA | With LoRA |
---|---|
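LoRA fine-tuning, as used above, adds a low-rank update to a frozen weight matrix: W' = W + (α/r)·B·A, where B is d×r, A is r×k, and r is small. A minimal pure-Python sketch of that arithmetic (illustrative only; these names are not DiffSynth Studio's API):

```python
# Toy sketch of the LoRA weight update W' = W + (alpha / r) * B @ A.
# Hypothetical helper names; not DiffSynth Studio's API.

def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def apply_lora(W, B, A, alpha):
    """Merge a LoRA adapter (B: d x r, A: r x k) into weight W (d x k)."""
    r = len(A)                      # rank = number of rows of A
    scale = alpha / r
    delta = [[scale * v for v in row] for row in matmul(B, A)]
    return [[w + d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]
```

Because only B and A are trained, an adapter stores d·r + r·k parameters instead of d·k, which is why LoRA checkpoints for large models stay small.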
python -m streamlit run DiffSynth_Studio.py
https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/93085557-73f3-4eee-a205-9829591ef954