skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0

[Serve] Enable multiple ports in SkyServe replicas #4356

Open Conless opened 1 week ago

Conless commented 1 week ago

The current implementation of SkyServe only allows each replica to expose one port. In some cases, a service may need to expose multiple ports, e.g. for a custom controller or a GUI. This PR adds support for that by allowing multiple ports and using the first one as the main port. For example, when we start a service with the following resource requirements:

resources:
  ports:
  - 8080-8081
  - 10000
  cpus: 2

The output of sky serve status will look like:

Services
NAME              VERSION  UPTIME  STATUS  REPLICAS  ENDPOINT            
sky-service-xxxx  1        28s     READY   2/2       xxxx:30001  

Service Replicas
SERVICE_NAME      ID  VERSION  ENDPOINT             LAUNCHED     RESOURCES       STATUS  REGION     
sky-service-xxxx  1   1        http://xxxx:8080     50 secs ago  1x AWS(vCPU=2)  READY   us-east-1  
sky-service-xxxx  2   1        http://xxxx:8080     1 min ago    1x AWS(vCPU=2)  READY   us-east-1 

The first port (8080) is shown as the replica endpoint, while the other ports (8081 and 10000) are still accessible.
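
To make the "first port is the main port" rule concrete, here is a minimal sketch of how a `ports` list like the one above could be expanded and its main port picked. The helper names `expand_ports` and `main_port` are hypothetical and written only for illustration; they are not taken from this PR or from the SkyPilot codebase, and the actual implementation may differ.

```python
from typing import List, Union


def expand_ports(specs: List[Union[int, str]]) -> List[int]:
    """Expand entries such as '8080-8081' or 10000 into a flat, ordered list.

    Hypothetical helper for illustration; not SkyPilot's actual parser.
    """
    ports: List[int] = []
    for spec in specs:
        text = str(spec).strip()
        if '-' in text:
            # A range entry: both ends are inclusive.
            start_text, end_text = text.split('-', 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f'Invalid port range: {text}')
            candidates = list(range(start, end + 1))
        else:
            candidates = [int(text)]
        for port in candidates:
            if not 1 <= port <= 65535:
                raise ValueError(f'Port out of range: {port}')
            # Keep first-seen order and skip duplicates.
            if port not in ports:
                ports.append(port)
    return ports


def main_port(specs: List[Union[int, str]]) -> int:
    """Return the first expanded port, treated here as the main port."""
    ports = expand_ports(specs)
    if not ports:
        raise ValueError('At least one port is required.')
    return ports[0]
```

With the example above, `expand_ports(['8080-8081', 10000])` yields 8080, 8081 and 10000 in that order, and `main_port` picks 8080, which matches the replica endpoint shown in the status output.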

Tested (run the relevant ones):

Conless commented 1 week ago

Hi @cblmemo! Would you like to have a look at this?

cblmemo commented 6 days ago

Can we also update the PR description?

cblmemo commented 6 days ago

Also, it would be great if we could test this on some real-world usage (e.g. deploy an LLM service and expose the metrics port in vLLM or the dashboard in Ray).