Nexus is a scalable and efficient serving system for DNN applications on a GPU cluster.
See BUILDING.md for details.
We provide a Docker image so that you can try Nexus quickly. There is also a step-by-step example that shows how to run Nexus with a simple application; we recommend starting there.
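For reference, a typical Docker workflow looks roughly like the sketch below. The image name and mount paths here are placeholders, not the published names; substitute the ones given in the Docker instructions and the example.

```bash
# Pull the Nexus image (placeholder name; use the image name from the Docker instructions).
docker pull nexus/nexus:latest

# Start an interactive container with GPU access and the model zoo mounted.
# --gpus all requires the NVIDIA Container Toolkit on the host.
docker run --gpus all -it \
    -v $(pwd)/nexus-models:/nexus-models \
    nexus/nexus:latest /bin/bash
```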
Nexus publishes a public model zoo on our department-hosted GitLab. To download it, install Git LFS first, then run:
```bash
git clone https://gitlab.cs.washington.edu/syslab/nexus-models
cd nexus-models
git lfs checkout
```
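If the clone completes but the model files are only small pointer stubs, Git LFS has not fetched the actual objects yet. `git lfs ls-files` shows which files are LFS-tracked, and `git lfs pull` downloads any missing objects:

```bash
# List the files tracked by Git LFS in the model zoo.
git lfs ls-files

# Fetch and check out any LFS objects that were not downloaded during the clone.
git lfs pull
```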
Nexus is a profile-based system, so before running it, make sure every model you plan to serve has been profiled on each GPU you will use. To profile a certain model on a certain GPU, run:
```bash
nexus/tools/profiler/profiler.py --gpu_list=GPU_INDEX --gpu_uuid \
    --framework=tensorflow --model=MODEL_NAME \
    --model_root=nexus-models/ --dataset=/path/to/datasets/
```
The profile will be saved to the `--model_root` directory.
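If you need to profile several models, a simple shell loop over the same command works; the model names below are just placeholders for entries in your copy of the model zoo.

```bash
# Profile a few models on GPU 0; the model names are placeholders.
for model in resnet50 vgg16 inception_v3; do
    nexus/tools/profiler/profiler.py --gpu_list=0 --gpu_uuid \
        --framework=tensorflow --model=$model \
        --model_root=nexus-models/ --dataset=/path/to/datasets/
done
```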
To run Nexus, start the scheduler first, then spawn a backend for each GPU card, and finally run the Nexus frontend of your application. See the examples for more concrete usage.
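As a rough sketch of the launch order only: the binary paths below are placeholders for wherever your build puts the executables, and the flags each process takes (scheduler address, GPU index, model root) are omitted because they depend on your setup; the examples show the real invocations.

```bash
# 1. Start the scheduler first (placeholder path).
./build/scheduler &

# 2. Spawn one backend process per GPU card, each pointed at the scheduler.
#    Repeat once per GPU; the GPU-selection flags are shown in the examples.
./build/backend &

# 3. Finally, run the Nexus frontend of your application (placeholder path).
./examples/simple_app/frontend &
```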