BennyKok / comfyui-deploy

An open-source Vercel-like deployment platform for ComfyUI
https://comfydeploy.ing
GNU Affero General Public License v3.0

Implement Modal's Keep Warm feature for faster inference #59

Open · slavakurilyak opened this issue 1 month ago

slavakurilyak commented 1 month ago

Description

It would be great to support Modal's keep_warm option when creating containers through ComfyDeploy. This feature lets Modal maintain a pool of pre-warmed instances, which can reduce cold start times and improve the responsiveness of serverless GPU inference for ComfyUI.

Current Situation

Currently, ComfyDeploy scales from zero, so any inference request that arrives with no running container pays a noticeable delay while a new container spins up.

Proposed Solution

Pass Modal's keep_warm parameter in the Modal function decorators used by ComfyDeploy. This can be done by modifying comfyui-deploy/builder/modal-builder/src/template/app.py.

Example Implementation

import modal
from modal import asgi_app  # needed for the @asgi_app() decorator below

# ... existing imports ...
# (stub, target_image, and config are defined earlier in app.py)

@stub.function(
    image=target_image,
    gpu=config["gpu"],
    allow_concurrent_inputs=100,
    concurrency_limit=1,
    timeout=10 * 60,
    keep_warm=3,  # add this line to keep 3 containers warm (suggested default)
)
@asgi_app()
def comfyui_app():
    # ... existing function body ...
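Rather than hard-coding the value, keep_warm could be read from the same config dict that already supplies the GPU setting, so each deployment can choose its own pool size. A minimal sketch, assuming a hypothetical keep_warm key in that config (not currently part of the builder's schema):

@stub.function(
    image=target_image,
    gpu=config["gpu"],
    allow_concurrent_inputs=100,
    concurrency_limit=1,
    timeout=10 * 60,
    # Hypothetical config key; defaulting to 0 preserves today's
    # scale-from-zero behavior unless a deployment opts in.
    keep_warm=config.get("keep_warm", 0),
)
@asgi_app()
def comfyui_app():
    # ... existing function body ...

Defaulting to 0 keeps the warm pool strictly opt-in.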

Benefits

  1. Reduced cold start latency (see the timing sketch after this list)
  2. Faster response times for inference requests
  3. Improved user experience, especially for applications requiring quick responses
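To quantify the first two benefits, one could time a request that hits a cold container against one that hits a warm one. A rough sketch using the requests library; the URL below is a placeholder for a deployed app's actual endpoint:

import time

import requests

# Placeholder URL; substitute the real *.modal.run endpoint of a deployment.
URL = "https://example--comfyui-app.modal.run/"

def time_request() -> float:
    """Return wall-clock seconds for one GET against the app."""
    start = time.perf_counter()
    requests.get(URL, timeout=600)
    return time.perf_counter() - start

# The first call after an idle period pays the cold start; with keep_warm
# set, repeat calls should land on a pre-warmed container.
print(f"first call:  {time_request():.2f}s")
print(f"second call: {time_request():.2f}s")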

Considerations

Warm containers are billed while idle, so keep_warm trades a standing cost for lower latency. Making the value configurable per deployment (defaulting to 0) would let users opt into that tradeoff explicitly.

omarei-omoto commented 3 weeks ago

yes please :)