Closed by TracebaK 11 months ago
Currently, KubeRay only supports Ray Serve through the RayService CRD. I am considering also exposing a K8s service that covers the head and all workers after the v1.0.0 release. I believe there is an issue tracking the progress. I'm closing this issue. Feel free to follow up if you have any further questions.
Search before asking
KubeRay Component
ray-operator
What happened + What you expected to happen
Hi,
I'm using KubeRay to run a NodePort Ray cluster service on a local k8s cluster, and there is a Ray Serve application running on the Ray cluster. Since Ray starts an HTTP proxy on each node, I want incoming requests to be distributed roughly equally across the nodes and then routed to a selected Serve replica. However, I noticed that the RayCluster service selects only the head pod, which means all traffic is first sent to the head pod's HTTP proxy, and that leads to low overall throughput for the service. I read the official documentation and know about adding an ingress as the load balancer. But I'm still wondering: even without an ingress, k8s should be able to load balance in a random or round-robin manner and ease the head node HTTP proxy bottleneck. Below is the config of the service; the last two selectors filter out all worker nodes.
KubeRay version: 0.5.0 Ray version: 2.6.3
I'm a beginner with k8s, so sorry if this is a stupid question; possibly it's not a bug at all. I'd also be glad to hear your thoughts on the design of the selectors. Thank you!
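To illustrate the selector behavior described above, here is a minimal Python sketch of how a Kubernetes Service picks its backing pods: a pod is selected only if it carries every key/value pair in the selector. The pod names and label values are hypothetical, and the label keys (`ray.io/cluster`, `ray.io/identifier`, `ray.io/node-type`) are assumed to be the ones in the service config, which is not reproduced in this issue. This only simulates the matching rule; it does not talk to a cluster.

```python
# Simulate Kubernetes equality-based label selection (no cluster needed).
# All pod names and label values below are hypothetical examples.

def matches(selector, labels):
    """A pod matches when it has every key/value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())

def selected_pods(selector, pods):
    """Return the sorted names of pods the Service would route to."""
    return sorted(name for name, labels in pods.items() if matches(selector, labels))

pods = {
    "demo-head": {
        "ray.io/cluster": "demo",
        "ray.io/identifier": "demo-head",
        "ray.io/node-type": "head",
    },
    "demo-worker-a": {
        "ray.io/cluster": "demo",
        "ray.io/identifier": "demo-worker",
        "ray.io/node-type": "worker",
    },
    "demo-worker-b": {
        "ray.io/cluster": "demo",
        "ray.io/identifier": "demo-worker",
        "ray.io/node-type": "worker",
    },
}

# Assumed shape of the head-only selector: the last two keys exclude workers.
head_selector = {
    "ray.io/cluster": "demo",
    "ray.io/identifier": "demo-head",
    "ray.io/node-type": "head",
}

# A selector on the cluster label alone would include head and workers.
cluster_selector = {"ray.io/cluster": "demo"}

print(selected_pods(head_selector, pods))
print(selected_pods(cluster_selector, pods))
```

With the head-only selector, the worker pods never become endpoints of the Service, so every request enters through the head pod's HTTP proxy. Dropping the two head-specific keys would, in this sketch, make all three pods eligible endpoints.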
Reproduction script
Anything else
No response
Are you willing to submit a PR?