skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0
6.82k stars 512 forks

[SkyServe]: API Authentication Options, HTTPS, More Stable Web Server than http serve #3360

Open jithinsarath opened 8 months ago

jithinsarath commented 8 months ago

I've combed through the docs to try to find an answer, without any luck.

  1. Do we have the ability to use a production-grade HTTP server instead of Python's built-in one?
  2. Can we implement HTTPS?
  3. When exposing APIs while serving, can we implement authentication?

If a rogue actor gets hold of the IP:Port of a running instance, the costs could go up significantly if there's no auth.

I do not expect the project team to solve these, but if some direction is given, I could take a shot at it.

concretevitamin commented 8 months ago

Thanks for raising this @jithinsarath.

  1. SkyServe uses Uvicorn at the moment for the controller / load balancer servers. It has not shown up as a bottleneck so far due to the compute-heavy nature of GenAI models. Do you foresee it becoming a bottleneck for your use cases?
  2. We'd love to get the community's help on this! Encrypting the requests/responses would be great to see in SkyServe.
  3. Yes, it should be possible. There are two prototypes of this based on Nginx, from @cblmemo:

We'd like to know more about your requirements to figure out the best ways forward :) Happy to follow up here or on Slack.

cblmemo commented 5 months ago

For 3, we recently added an example on authorization leveraging the underlying serving engine; pls check here :)) Does that suit your needs?
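For anyone wanting to see what a token check in front of a served endpoint looks like in general, here is a minimal sketch in plain Python. It is not SkyServe's API and not the example linked above. The helper names `extract_bearer_token` and `is_authorized` are hypothetical, and it assumes the service is handed a shared secret (for instance through an environment variable) and that clients send it in an `Authorization: Bearer <token>` header.

```python
import hmac
from typing import Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, if any.

    Hypothetical helper for illustration; header names are matched
    case-insensitively.
    """
    for name, value in headers.items():
        if name.lower() == "authorization":
            scheme, _, token = value.strip().partition(" ")
            if scheme.lower() == "bearer" and token.strip():
                return token.strip()
    return None


def is_authorized(headers: Mapping[str, str], expected_token: str) -> bool:
    """Check the request's bearer token against the expected shared secret.

    hmac.compare_digest performs a constant-time comparison, so the time
    taken does not reveal how many leading characters matched.
    """
    token = extract_bearer_token(headers)
    if token is None:
        return False
    return hmac.compare_digest(token.encode(), expected_token.encode())
```

A request handler would call `is_authorized(request_headers, secret)` before doing any expensive work and reply with 401 otherwise. Without HTTPS the token still travels in clear text, so this only helps together with point 2.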

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.

cblmemo commented 1 month ago

Quick update: for 2, we have an ongoing PR #3380 and we plan to merge it soon ;) For 1, we have recently been exploring the possibility of adopting Envoy proxy as our load balancer. Stay tuned!