An open-source Stable Diffusion production server that shows how to deploy diffusion models in a real production environment, with load balancing, GPU inference, performance testing, microservice orchestration, and more, all handled with the Lightning Apps framework.
Full tutorial on how to build this app.
Muse uses the open-source Stable Diffusion model made available by Stability AI. We apply a few fancy tricks to make inference super fast.
Here's a small snippet showing our model server:
# Excerpt from the model server class; imports and constants such as
# IMAGE_SIZE and INFERENCE_REQUEST_TIMEOUT are defined elsewhere in the app.
@torch.inference_mode()
def predict(self, dreams: List[Data], entry_time: int):
    # handle timeout: drop requests that waited too long before reaching the model
    if time.time() - entry_time > INFERENCE_REQUEST_TIMEOUT:
        raise TimeoutException()

    # set up the inference settings (the first request decides the step count for the batch)
    height = width = IMAGE_SIZE
    num_inference_steps = 50 if dreams[0].high_quality else 25

    prompts = [dream.prompt for dream in dreams]

    # GPU inference
    if torch.cuda.is_available():
        with autocast("cuda"):
            torch.cuda.empty_cache()
            pil_results = self._model(
                prompts,
                height=height,
                width=width,
                num_inference_steps=num_inference_steps,
            )
        # apply filter: swap flagged images for a warning placeholder
        nsfw_content = self._safety_checker(pil_results)
        for i, nsfw in enumerate(nsfw_content):
            if nsfw:
                pil_results[i] = Image.open("assets/nsfw-warning.png")
    else:
        # no GPU available: simulate latency and return random-noise images
        time.sleep(3)
        pil_results = [Image.fromarray(np.random.randint(0, 255, (height, width, 3), dtype="uint8"))] * len(prompts)

    # encode each image as a base64 PNG data URI and return the results
    results = []
    for image in pil_results:
        buffered = BytesIO()
        image.save(buffered, format="PNG")
        img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
        # make sure pil_results is a single item array or it'll rewrite image
        results.append({"image": f"data:image/png;base64,{img_str}"})

    return results
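Each entry returned by `predict` carries its image as a base64 PNG data URI. A minimal sketch of how a client might turn that string back into raw PNG bytes, using only the standard library, could look like this. The helper name `decode_data_uri` is hypothetical and not part of Muse, and the payload below is placeholder bytes rather than a real image.

```python
import base64

# Prefix used by the snippet above when it builds each result entry.
PNG_PREFIX = "data:image/png;base64,"


def decode_data_uri(uri: str) -> bytes:
    """Return the raw bytes from a base64 PNG data URI (hypothetical helper)."""
    if not uri.startswith(PNG_PREFIX):
        raise ValueError("expected a base64 PNG data URI")
    return base64.b64decode(uri[len(PNG_PREFIX):])


# Round trip with placeholder bytes standing in for a real PNG payload.
payload = b"\x89PNG\r\n\x1a\n" + b"placeholder"
uri = PNG_PREFIX + base64.b64encode(payload).decode("utf-8")
decoded = decode_data_uri(uri)
```

The decoded bytes can then be written to a `.png` file or handed to an image library of your choice.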
To run this app locally, follow these steps:
conda create --name muse_app python=3.9 --yes
conda activate muse_app
git clone https://github.com/Lightning-AI/stable-diffusion-deploy.git
cd stable-diffusion-deploy
bash dev_install.sh
## To run the app locally
python -m lightning run app app.py
## To run the app on the cloud to share it with your peers and users
python -m lightning run app app.py --cloud
You can configure Muse to customize the number of workers, the batch size, or the Stable Diffusion version of your choice. Learn more.
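As a rough illustration of what the worker count and batch size control, the sketch below groups prompts into batches and hands the batches to workers in turn. The names `MuseConfig`, `make_batches`, and `assign_round_robin` are hypothetical, and the round-robin policy is an assumption made for the example. Muse's actual setting names and load balancer may work differently.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MuseConfig:
    # Hypothetical settings; the real option names in Muse may differ.
    num_workers: int = 2
    batch_size: int = 4


def make_batches(prompts: List[str], batch_size: int) -> List[List[str]]:
    """Split prompts into consecutive groups of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]


def assign_round_robin(batches: List[List[str]], num_workers: int) -> Dict[int, List[List[str]]]:
    """Assign batch k to worker k % num_workers (assumed policy, for illustration)."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    assignment: Dict[int, List[List[str]]] = {w: [] for w in range(num_workers)}
    for k, batch in enumerate(batches):
        assignment[k % num_workers].append(batch)
    return assignment


config = MuseConfig(num_workers=2, batch_size=2)
prompts = ["a cat", "a dog", "a fox", "an owl", "a whale"]
batches = make_batches(prompts, config.batch_size)
plan = assign_round_robin(batches, config.num_workers)
```

A larger batch size means fewer, bigger calls to the model per worker, while more workers spread the batches across more GPUs.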
Muse is a blueprint for building diffusion-based production systems with Lightning AI. This app shows you how to:
Architecture diagram of Muse