Open JoshKarpel opened 4 months ago
@JoshKarpel Could you create a separate issue for the asynchronous versions of these APIs, and separate this issue out into optimizations that can be made to the synchronous versions of the APIs?
Can do!
Description
`serve.get_app_handle()` and `serve.get_deployment_handle()`, and their underlying method `ServeControllerClient.get_handle()`, allow users to dynamically get a handle to a Serve Deployment (either the ingress deployment of an app, or a specific deployment, depending on which API you use).

These methods involve either 1 or 2 network calls to the Serve Controller to gather information, but those calls are done synchronously (`ray.get(...)`), which makes them inefficient to use in asynchronous code such as a FastAPI Deployment acting as a dynamic ingress to other deployments. Providing `async` variants of these functions would be a useful feature for `async` callers.

I would be happy to make these changes, though I think I would need some guidance on naming conventions and whatnot :)
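To illustrate the blocking problem and one possible stopgap, here is a minimal sketch that moves a synchronous lookup onto a worker thread with `asyncio.to_thread` (Python 3.9+). This is not the proposed `async` API: `blocking_get_handle` is a hypothetical stand-in for a synchronous call such as `serve.get_app_handle()`, with `time.sleep` simulating the controller round trip, and `get_handle_async` and `heartbeat` are illustrative names only.

```python
import asyncio
import time


def blocking_get_handle(app_name: str) -> str:
    """Hypothetical stand-in for a synchronous call like serve.get_app_handle()."""
    time.sleep(0.5)  # simulates the blocking round trip to the Serve Controller
    return f"handle-for-{app_name}"


async def get_handle_async(app_name: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_get_handle, app_name)


async def heartbeat(ticks: list) -> None:
    # Stands in for other requests that should keep making progress.
    for _ in range(5):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def main():
    ticks = []
    start = time.monotonic()
    # Both coroutines run concurrently; the heartbeat is not stalled by the lookup.
    handle, _ = await asyncio.gather(get_handle_async("my_app"), heartbeat(ticks))
    return handle, ticks, start


if __name__ == "__main__":
    handle, ticks, start = asyncio.run(main())
    print(handle, f"last heartbeat at {ticks[-1] - start:.2f}s")
```

Calling `blocking_get_handle` directly inside a coroutine would instead hold the event loop for the full duration of the lookup. Offloading to a thread avoids that but still ties up a thread per call, which is part of why native `async` variants would be preferable.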
Use case
See the discussion at https://ray-distributed.slack.com/archives/CNCKBBRJL/p1713194071772759 for more details about our use case. TL;DR: we create handles dynamically at runtime and noticed that doing so was blocking other requests in our FastAPI app.