mindspore-lab / mindone

one for all, Optimal generator with No Exception
Apache License 2.0

feat(diffusers/pipelines): add pipelines like ControlNet, T2I-Adapter, UnCLIP, Pixart, AnimateDiff, etc. #562

Open townwish4git opened 1 week ago

townwish4git commented 1 week ago

What does this PR do?

[Sample outputs, images omitted: PixArtAlphaPipeline, AnimateDiffPipeline, ShapEPipeline]

What's New

🔥 Plenty of pipelines are available now.

New Pipelines

Usage Example

Most pipelines come with an EXAMPLE_DOC_STRING in their Python source files. Copy the example and run it to try the pipeline quickly. Here is the example for AnimateDiffPipeline:

>>> import mindspore
>>> from mindone.diffusers import MotionAdapter, AnimateDiffPipeline, DDIMScheduler
>>> from mindone.diffusers.utils import export_to_gif
>>> adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", mindspore_dtype=mindspore.float16, use_safetensors=True)
>>> model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
>>> pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter, mindspore_dtype=mindspore.float16, use_safetensors=True)
>>> scheduler = DDIMScheduler.from_pretrained(
...     model_id,
...     subfolder="scheduler",
...     clip_sample=False,
...     timestep_spacing="linspace",
...     beta_schedule="linear",
...     steps_offset=1,
... )
>>> pipe.scheduler = scheduler
>>> output = pipe(
...     prompt=(
...         "masterpiece, bestquality, highlydetailed, ultradetailed, sunset, "
...         "orange sky, warm lighting, fishing boats, ocean waves seagulls, "
...         "rippling water, wharf, silhouette, serene atmosphere, dusk, evening glow, "
...         "golden hour, coastal landscape, seaside scenery"
...     ),
...     negative_prompt="bad quality, worse quality",
...     num_frames=16,
...     guidance_scale=7.5,
...     num_inference_steps=25,
... )
>>> frames = output[0][0]
>>> export_to_gif(frames, "animation.gif")
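Because these examples are written with `>>>` and `...` prompts, they need the prompts stripped before they can be run as a plain script. The following is a minimal sketch of one way to do that with the standard-library `doctest` parser. The helper name `doctest_to_script` and the `SAMPLE_DOC` string are hypothetical stand-ins for illustration, not part of mindone or a real pipeline doc string.

```python
import doctest


def doctest_to_script(doc: str) -> str:
    """Strip the >>> and ... prompts from a doctest-style string.

    Hypothetical helper: it only collects the source of each example
    and ignores any expected-output lines.
    """
    examples = doctest.DocTestParser().get_examples(doc)
    return "".join(example.source for example in examples)


# Stand-in for an EXAMPLE_DOC_STRING; a real one would import mindone.
SAMPLE_DOC = """
Examples:
>>> x = 1 + 1
>>> y = [
...     x,
...     x * 2,
... ]
>>> total = sum(y)
"""

script = doctest_to_script(SAMPLE_DOC)
namespace = {}
exec(script, namespace)  # run the extracted code in a fresh namespace
print(namespace["total"])  # 6
```

The extracted `script` can also be written to a `.py` file and run directly instead of being passed to `exec`.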

Limitations

NPU Memory Limitation

Flash Attention in mindone.diffusers requires MindSpore 2.3 or later. The current environment uses MindSpore 2.2.10, so pipelines may run out of NPU memory, particularly those that process long sequences, such as video generation. We have therefore lowered the image resolution in the example doc string of I2VGenXLPipeline, which may reduce the quality of the generated video.
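The version constraint above can be checked before running a memory-heavy pipeline. The sketch below uses two hypothetical helpers, `parse_version` and `flash_attention_expected`, which are not part of mindone. It assumes that "after MindSpore version 2.3" means version 2.3 or newer, and it only compares version strings rather than probing the installed framework.

```python
import re


def parse_version(version: str) -> tuple:
    """Return the leading numeric components of a version string as ints."""
    numbers = []
    for part in version.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def flash_attention_expected(mindspore_version: str) -> bool:
    # Assumption: the threshold is "2.3 or newer".
    return parse_version(mindspore_version) >= (2, 3)


print(flash_attention_expected("2.2.10"))  # False
print(flash_attention_expected("2.3.0"))   # True
```

With MindSpore installed, the argument would typically be `mindspore.__version__`. When the check returns False, lowering the resolution or frame count is the workaround this PR itself applies.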

Static Graph Availability

The render model used by Shap-E has a flexible architecture and flexible input/output formats, so it might not be supported in static graph mode. We therefore recommend running Shap-E pipelines in dynamic graph mode to ensure compatibility and correct behavior.
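As a rough sketch of this recommendation, the snippet below selects dynamic graph mode (PyNative mode in MindSpore terms) for Shap-E pipelines. The helpers `needs_dynamic_graph` and `configure_for` are hypothetical, and matching on the `ShapE` class-name prefix is an assumption made for illustration. The only MindSpore call used is `mindspore.set_context(mode=mindspore.PYNATIVE_MODE)`, and it is skipped when MindSpore is not installed.

```python
import importlib.util


def needs_dynamic_graph(pipeline_name: str) -> bool:
    # Assumption: Shap-E pipelines are identified by their class-name prefix.
    return pipeline_name.startswith("ShapE")


def configure_for(pipeline_name: str) -> bool:
    """Switch to dynamic graph mode if the pipeline calls for it.

    Returns True when dynamic graph mode is recommended for the pipeline.
    """
    dynamic = needs_dynamic_graph(pipeline_name)
    if dynamic and importlib.util.find_spec("mindspore") is not None:
        import mindspore

        # PYNATIVE_MODE is MindSpore's dynamic graph mode.
        mindspore.set_context(mode=mindspore.PYNATIVE_MODE)
    return dynamic


print(needs_dynamic_graph("ShapEPipeline"))        # True
print(needs_dynamic_graph("AnimateDiffPipeline"))  # False
```

Calling `configure_for("ShapEPipeline")` before constructing the pipeline would apply the setting. Other pipelines are left in whatever mode is already configured.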

What's more about this PR

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@geniuspatrick @CaitinZhao @SamitHuang