kserve / modelmesh

Distributed Model Serving Framework
Apache License 2.0

how about using modelmesh to serve thousands of stable diffusion models #95

Closed Jack47 closed 9 months ago

Jack47 commented 1 year ago

I want to use ModelMesh to serve thousands of stable diffusion models. Any advice would be appreciated~

  1. I'm using Triton as the serving runtime. Inference time is about 3~10s.
  2. I'm using ensembles in Triton to handle business logic like auditing and watermarking; these may become standalone services in the future.
  3. Currently every model has its own k8s Service and Ingress rules.
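For reference, deploying one of those models under ModelMesh typically means creating an `InferenceService` with the ModelMesh deployment-mode annotation instead of a per-model Service and Ingress. A minimal sketch, assuming a MinIO storage secret key of `localMinIO` and hypothetical model/runtime names:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sd-model-001                      # hypothetical model name
  annotations:
    serving.kserve.io/deploymentMode: ModelMesh
spec:
  predictor:
    model:
      modelFormat:
        name: onnx                        # assumption: adjust to what your Triton runtime accepts
      storage:
        key: localMinIO                   # key into the storage-config secret (assumption)
        path: sd-models/model-001         # hypothetical path in the bucket
```

ModelMesh then loads and unloads models across the shared runtime pods on demand, rather than keeping a dedicated deployment per model.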

Goals:

  1. achieve higher cluster resource utilization, especially for GPUs
  2. keep the latency of every inference request as low as possible

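One consequence of the setup above is that clients no longer need per-model endpoints: all models sit behind a single ModelMesh service, and the target model is selected by name in the KServe V2 REST path. A small sketch of building such a request, where the service host, model name, and input tensor name are all assumptions for illustration:

```python
import json

def build_v2_request(prompt: str) -> dict:
    """Build a KServe V2 inference request body for a text prompt."""
    return {
        "inputs": [
            {
                "name": "PROMPT",     # input tensor name (assumption)
                "shape": [1],
                "datatype": "BYTES",
                "data": [prompt],
            }
        ]
    }

def infer_url(base: str, model_name: str) -> str:
    # All models share one endpoint; the path picks the model.
    return f"{base}/v2/models/{model_name}/infer"

# Hypothetical service host and model name:
url = infer_url("http://modelmesh-serving:8008", "sd-model-001")
body = build_v2_request("a watercolor painting of a lighthouse")
print(url)
print(json.dumps(body))
```

Sending `body` as JSON via an HTTP POST to `url` (e.g. with `requests.post`) would then route the request to whichever runtime pod currently holds the model.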
ckadner commented 10 months ago

@Jack47 -- were you able to use ModelMesh-Serving for your stable diffusion models? Did you run into any specific issues?

Wikipedia thinks it should look like this :-)

(image attachment)

Jack47 commented 9 months ago

Currently we don't use ModelMesh. Thanks for your response.