alpa-projects / mms

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)

AlpaServe

Repository for Alpa's multi-model serving system.

This is the official implementation of our OSDI '23 paper, "AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving".

To reproduce all the main results in our paper, follow the instructions in the artifact folder.