skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0
6.74k stars 499 forks

SkyServe: Allow to configure resource spec for Serve Controller VM #3914

Closed yannickschuchmann closed 1 month ago

yannickschuchmann commented 1 month ago

I'm running a basic managed cluster and for me it's really important to have it as cost-effective as possible. I've found the following in the code:

# Due to the CPU/memory usage of the controller process launched with a job on
# controller VM (use ray job under the hood), we need to reserve some CPU/memory
# for each serve controller process.
# Serve: A default controller with 4 vCPU and 16 GB memory can run up to 16
# services.
CONTROLLER_MEMORY_USAGE_GB = 1.0

Considering that I will probably only run 1 service and maybe 1 job, a controller instance with 16 GB RAM and 4 vCPUs is overkill. It would be amazing to be able to specify the resource spec for the serve controller as well. Or is that already possible and I just didn't find it in the docs?

Thanks :)

Stealthwriter commented 1 month ago

you can do this

Customizing SkyServe controller resources

You may want to customize the resources of the SkyServe controller for several reasons:

Use a lower-cost controller (if you have only a few services running).

Force the controller to run in a specific location. This is particularly useful when you want the service endpoint within a specific geographical region. (Default: cheapest location)

Change the maximum number of services that can run concurrently, which is the smaller of 4x the controller's vCPUs and the controller's memory in GiB. (Default: 16)

Change the disk_size of the controller to store more logs. (Default: 200GB)

To achieve the above, you can specify custom configs in ~/.sky/config.yaml with the following fields:

serve:
  # NOTE: these settings only take effect for a new SkyServe controller, not if
  # you have an existing one.
  controller:
    resources:
      # All configs below are optional.
      # Specify the location of the SkyServe controller.
      cloud: gcp
      region: us-central1
      # Specify the maximum number of services that can be run concurrently.
      cpus: 2+  # number of vCPUs, max concurrent services = min(4 * cpus, memory in GiB)
      # Specify the disk_size in GB of the SkyServe controller.
      disk_size: 1024

The resources field has the same spec as a normal SkyPilot job; see here.
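To get a feel for how small a controller can be for a given number of services, the rule quoted above (max concurrent services = min(4 * cpus, memory in GiB)) can be sketched in a few lines of Python. This is only an illustration of that documented rule, not SkyPilot's actual code: the function name `max_concurrent_services` is hypothetical, and rounding fractional memory down is an assumption.

```python
# Hypothetical helper, NOT part of SkyPilot's API. It only restates the
# documented rule: max concurrent services = min(4 * vCPUs, memory in GiB).
def max_concurrent_services(vcpus: int, memory_gib: float) -> int:
    """Estimate how many services a controller of this size could host."""
    # Assumption: a fractional result is rounded down to a whole service.
    return int(min(4 * vcpus, memory_gib))


if __name__ == "__main__":
    # Default controller mentioned in the thread: 4 vCPUs, 16 GB memory.
    print(max_concurrent_services(4, 16))
    # A smaller controller, e.g. 2 vCPUs and 8 GiB of memory.
    print(max_concurrent_services(2, 8))
```

Under this rule, a controller with 2 vCPUs and 8 GiB of memory would still leave room for several services, so a much smaller instance than the default should be enough for a single service.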

Stealthwriter commented 1 month ago

https://skypilot.readthedocs.io/en/latest/serving/sky-serve.html#skyserve-cont

cblmemo commented 1 month ago

Hi! This might be what you want: https://skypilot.readthedocs.io/en/latest/serving/sky-serve.html#customizing-skyserve-controller-resources

Also, thanks to @Stealthwriter for providing the information!

Please let me know if that works for you ;)

yannickschuchmann commented 1 month ago

Hey @cblmemo! Yes, it solves the issue. Sorry, I could have closed the issue already. Thank you 🙏