ucbrise / clipper

A low-latency prediction-serving system
http://clipper.ai
Apache License 2.0

Managing Compute Resources for Containers #747

Open jeffjunzhang opened 5 years ago

jeffjunzhang commented 5 years ago

(1) In Clipper, when we deploy a model container (e.g., a trained PyTorch model), how do we set CPU and memory limits for it? Is this supported by clipper_admin? If not, is there another way to achieve it?

(2) When we start the Clipper cluster, can we set a CPU/memory limit on the whole cluster? Is there also a way to set CPU/memory limits for query_frontend and management_frontend?

Thank you for your help!

withsmilo commented 5 years ago

Hi, @jeffjunzhang. Currently, clipper_admin does not support this, so our team manually modified the following YAML files for the purpose you mentioned.

I will try to add a feature to Clipper for managing k8s resources. Thanks for raising this.
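For the Kubernetes case, a manual change like the one described above usually comes down to adding a `resources` stanza to each container in a Deployment manifest. Here is a minimal sketch in Python, assuming the manifest is already loaded as a dict. The helper `with_resource_limits`, the manifest fragment, and the image name are hypothetical placeholders, not Clipper's actual files or API.

```python
import copy
import json


def with_resource_limits(deployment, cpu, memory):
    """Return a copy of a Deployment manifest with the same CPU/memory
    requests and limits set on every container in the pod template."""
    patched = copy.deepcopy(deployment)
    for container in patched["spec"]["template"]["spec"]["containers"]:
        container["resources"] = {
            "requests": {"cpu": cpu, "memory": memory},
            "limits": {"cpu": cpu, "memory": memory},
        }
    return patched


# Hypothetical, incomplete manifest fragment: names and image are placeholders.
example = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "example-model"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "example-model", "image": "example/model:latest"}
                ]
            }
        }
    },
}

if __name__ == "__main__":
    # Print the patched manifest as JSON for inspection.
    print(json.dumps(with_resource_limits(example, "10", "8Gi"), indent=2))
```

Setting requests equal to limits is just one choice for the sketch. They can differ if the scheduler should reserve less than the hard cap.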

jeffjunzhang commented 5 years ago

Thank you, @withsmilo. By the way, if I'm using the Docker container manager (DockerContainerManager()), which files should I modify manually?

After the modification, we need to recompile all the source code from scratch, right?

withsmilo commented 5 years ago

@jeffjunzhang, are you testing Clipper on your local PC? If you are on a Mac, see https://docs.docker.com/docker-for-mac/#advanced.

jeffjunzhang commented 5 years ago

@withsmilo Thanks again. I'm testing on a local 24-core server running Ubuntu.

Basically, say I have one PyTorch model to deploy with Clipper. I want to limit the model container to 10 cores plus some amount of memory, and also allow 10 cores for the query frontend and management frontend containers.

What's the simplest way to achieve this?

withsmilo commented 5 years ago

@jeffjunzhang, Clipper uses DockerContainerManager for testing only, not for production. How about using the `docker update` command to change the resource limits of the containers after deploying Clipper and your models? See https://docs.docker.com/engine/reference/commandline/update/. I think this is the simplest way.
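As a rough illustration of that suggestion, the sketch below builds `docker update` invocations with the CLI's `--cpus` and `--memory` flags and applies them with `subprocess`. The container names and the memory values are placeholders (assumptions). Take the real names from `docker ps` after deployment. `docker_update_cmd` and `apply_limits` are hypothetical helpers, not part of clipper_admin.

```python
import shlex
import subprocess


def docker_update_cmd(container, cpus=None, memory=None):
    """Build the argument list for `docker update` on one container."""
    cmd = ["docker", "update"]
    if cpus is not None:
        cmd += ["--cpus", str(cpus)]
    if memory is not None:
        cmd += ["--memory", memory]
    cmd.append(container)
    return cmd


def apply_limits(limits, dry_run=True):
    """Print each command; run it as well when dry_run is False."""
    for container, opts in limits.items():
        cmd = docker_update_cmd(container, **opts)
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Placeholder container names and memory sizes; replace with values from `docker ps`.
LIMITS = {
    "my-pytorch-model-container": {"cpus": 10, "memory": "8g"},
    "my-query-frontend-container": {"cpus": 10},
}

if __name__ == "__main__":
    apply_limits(LIMITS, dry_run=True)
```

With `dry_run=True` the script only prints the commands, so they can be reviewed before anything is changed on the host.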