ray-project / kuberay

A toolkit to run Ray applications on Kubernetes
Apache License 2.0

[Chore] Run operator outside the cluster #2090

Closed: MortalHappiness closed this 2 months ago

MortalHappiness commented 2 months ago

Why are these changes needed?

Running the operator outside the cluster allows developers to set breakpoints in their IDEs for debugging purposes.

Related issue number

N/A

Steps to Reproduce

  1. Create a Kubernetes cluster with one of: kind create cluster --image=kindest/node:v1.24.0, k3d cluster create, or minikube start
  2. make -C ray-operator install
  3. make -C ray-operator build
  4. ./ray-operator/bin/manager -leader-election-namespace default -use-kubernetes-proxy
  5. make -C ray-operator test-e2e
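
The steps above can be sketched as a single script. This is only an illustration under a few assumptions: it is run from the repository root, it uses kind for step 1, and it starts the operator in the background so that the e2e tests can run from the same shell. The `DRY_RUN` variable and the `run` and `reproduce` helpers are hypothetical names introduced for this sketch and are not part of KubeRay. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

# DRY_RUN is a hypothetical helper variable, not part of KubeRay.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN is not 1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

reproduce() {
  # Step 1: create a local cluster (k3d or minikube are the alternatives).
  run kind create cluster --image=kindest/node:v1.24.0

  # Steps 2-3: the install and build make targets from the PR description.
  run make -C ray-operator install
  run make -C ray-operator build

  # Step 4: the operator runs in the foreground, so this sketch starts it
  # in the background and stops it when the script exits.
  echo "+ ./ray-operator/bin/manager -leader-election-namespace default -use-kubernetes-proxy &"
  if [ "$DRY_RUN" != "1" ]; then
    ./ray-operator/bin/manager -leader-election-namespace default -use-kubernetes-proxy &
    operator_pid=$!
    trap 'kill "$operator_pid" 2>/dev/null || true' EXIT
  fi

  # Step 5: run the e2e tests against the locally running operator.
  # (A short wait may be needed first if the operator is slow to start.)
  run make -C ray-operator test-e2e
}

reproduce
```

Running the script with `DRY_RUN=0` executes the commands instead of printing them. For IDE debugging, step 4 would instead be launched from the IDE's debugger rather than from this script.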

andrewsykim commented 2 months ago

Another issue to check is that the KubeRay operator needs to communicate with the Ray head pod in RayJob or RayService CRs. Running it outside the Kubernetes cluster may cause some issues.

If you run into this issue, you can try enabling --use-kubernetes-proxy (only available in the nightly build).

MortalHappiness commented 2 months ago

@kevin85421 @andrewsykim I successfully ran the e2e tests after some code modifications. Please see the Steps to Reproduce section in the PR description. Thanks.

andrewsykim commented 2 months ago

Overall LGTM; glad to see this is possible with the current make targets.