Lightning-Universe / stable-diffusion-deploy

Learn to serve Stable Diffusion models on cloud infrastructure at scale. This Lightning App shows load-balancing, orchestrating, pre-provisioning, dynamic batching, GPU-inference, micro-services working together via the Lightning Apps framework.
https://lightning.ai/muse
Apache License 2.0
392 stars 39 forks source link
model-serving stable-diffusion


Use AI to inspire your art

Muse is live here

Run your ownLightning AIUse Muse LiveFull Tutorial

[![ReadTheDocs](https://readthedocs.org/projects/pytorch-lightning/badge/?version=stable)](https://lightning.ai/lightning-docs/) [![Slack](https://img.shields.io/badge/slack-chat-green.svg?logo=slack)](https://www.pytorchlightning.ai/community) [![license](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/Lightning-AI/lightning/blob/master/LICENSE)

Muse

Open source, stable-diffusion production server to show how to deploy diffusion models in a real production environment with: load-balancing, gpu-inference, performance-testing, micro-services orchestration and more. All handled easily with the Lightning Apps framework.

The app is live here.

Full tutorial on how to build this app.

image

Model

Muse uses the opensource Stable Diffusion model made available by stability AI. We apply a few fancy tricks to make the inference super fast.

Here's a small snippet showing our model server

    @torch.inference_mode()
    def predict(self, dreams: List[Data], entry_time: int):
        # handle timeout
        if time.time() - entry_time > INFERENCE_REQUEST_TIMEOUT:
            raise TimeoutException()

        # sets up the inference settings
        height = width = IMAGE_SIZE
        num_inference_steps = 50 if dreams[0].high_quality else 25

        prompts = [dream.prompt for dream in dreams]

        # GPU inference
        if torch.cuda.is_available():
            with autocast("cuda"):
                torch.cuda.empty_cache()
                pil_results = self._model(
                    prompts,
                    height=height,
                    width=width,
                    num_inference_steps=num_inference_steps,
                )
            # apply filter
            nsfw_content = self._safety_checker(pil_results)
            for i, nsfw in enumerate(nsfw_content):
                if nsfw:
                    pil_results[i] = Image.open("assets/nsfw-warning.png")
        else:
            time.sleep(3)
            pil_results = [Image.fromarray(np.random.randint(0, 255, (height, width, 3), dtype="uint8"))] * len(prompts)

        # return the model
        results = []
        for image in pil_results:
            buffered = BytesIO()
            image.save(buffered, format="PNG")
            img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
            # make sure pil_results is a single item array or it'll rewrite image
            results.append({"image": f"data:image/png;base64,{img_str}"})

        return results

Run your own

To run this app locally, follow these steps:

conda create --name muse_app python=3.9 --yes
conda activate muse_app

git clone https://github.com/Lightning-AI/stable-diffusion-deploy.git
cd stable-diffusion-deploy

bash dev_install.sh

## To run the app locally
python -m lightning run app app.py

## To run the app on the cloud to share it with your peers and users
python -m lightning run app app.py --cloud

You can configure Muse to select customize number of workers, batch size or select the stable diffusion version of your choice. Learn more.

About this Lightning App

Muse is a blueprint for building diffusion-based production systems with Lightning AI. This app shows you how to:

Architecture Diagram of Muse -

arch diagram

Slackbot

How to integrate Muse to a Slack Workspace
You can integrate this app in your Slack Workspace and send images in Slack channels. This app uses the [Slack Command Bot Component](https://github.com/Lightning-AI/LAI-slack-command-bot-Component) to interact with Slack commands. You can also check out this video tutorial:
### How to create the Slack Command Bot **Step 1:** Go to https://api.slack.com and create an app. **Step 2:** Copy the following tokens and secrets from the Slack API settings by going to https://api.slack.com/apps. These tokens have to be passed either as argument or environment variable to [SlackCommandBot](https://github.com/Lightning-AI/LAI-slack-command-bot-Component/blob/main/slack_command_bot/component.py#L18) class.
Required Token name and environment variables: - Client ID (SLACK_CLIENT_ID) - Client Secret (CLIENT_SECRET) - Signing Secret (SIGNING_SECRET) - Bot User OAuth Token (BOT_TOKEN) - App-Level Token (SLACK_TOKEN)
**Step 3:** Implement the `SlackCommandBot.handle_command(...)` method the way you want to interact with the commands. The return value will be shown only to you. > ![](./assets/slack-ss.png) **Step 4:** (optional) If you want your slack app to be distributable to public then you need to implement `SlackCommandBot.save_new_workspace(...)` which should save `team_id` and its corresponding `bot_token` into a database. During the `handle_command(...)` method you will need to fetch `bot_token` based on the received `team_id`.