Open stwerner97 opened 1 year ago
I am not very familiar with the ray-launcher but I will try to take a stab here.
You mentioned the address `http://localhost:8265` in the standalone command `ray job submit --address http://localhost:8265 -- python script.py`, while you use `hydra.launcher.ray.init.address=localhost:6379` in the config. Shouldn't these be the same?
Hi @shagunsodhani, thanks for responding! 😊
Yes, they should be the same; I made a copy-paste error when reporting the issue. The issue also occurs for `hydra.launcher.ray.init.address=localhost:8265`. I've edited the initial issue report and corrected the port number.
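For context on the address question, here is a minimal sketch contrasting the two entry points involved. It assumes Ray's default ports (8265 for the dashboard, which serves the Jobs API, and 10001 for the Ray Client server) and assumes that `hydra.launcher.ray.init.address` is forwarded to `ray.init(address=...)`; neither assumption has been verified against this cluster. The helper names are hypothetical and are not part of Hydra or Ray.

```python
# Sketch only: two different ways of reaching a Ray cluster.
# Hosts and ports are assumptions based on Ray's defaults.


def jobs_api_address(host: str = "localhost", port: int = 8265) -> str:
    """Address format used by `ray job submit --address` (dashboard HTTP endpoint)."""
    return f"http://{host}:{port}"


def ray_client_address(host: str = "localhost", port: int = 10001) -> str:
    """Address format for ray.init() when connecting through Ray Client."""
    return f"ray://{host}:{port}"


def submit_via_jobs_api(entrypoint: str = "python script.py") -> str:
    """Submit a job over HTTP, roughly what `ray job submit` does."""
    # Imported lazily so this file can be imported without Ray installed.
    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient(jobs_api_address())
    return client.submit_job(entrypoint=entrypoint)  # returns the job ID


def connect_via_ray_client() -> None:
    """Attach a driver to the cluster, as ray.init(address=...) would."""
    import ray

    # Requires the Ray Client port to be reachable (assumed: 10001 forwarded).
    ray.init(address=ray_client_address())
```

If the launcher does call `ray.init` with the configured address, the dashboard port may simply be the wrong endpoint for it, which would be consistent with the GCS connection failure; this is a hypothesis, not a confirmed diagnosis.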
🐛 Bug
Description
Using the Hydra Ray launcher, I want to submit Hydra's Simple Ray Launcher Example to a remote Ray cluster running on Kubernetes. Launching the job via `python my_app.py --multirun hydra/launcher=ray hydra.launcher.ray.init.address=localhost:8265`, however, fails to connect to Ray's GCS.
I have configured the Ray launcher plugin to point to the address of the (port-forwarded) Ray dashboard. I verified that I can successfully submit jobs using Ray directly, i.e., `ray job submit --address http://localhost:8265 -- python script.py` is successful.
Some information on the Ray Kubernetes cluster:
- `raycluster-kuberay-head` uses the image `rayproject/ray:2.3.0`.
- `kuberay-operator` uses the image `kuberay/operator:v0.5.0`.
- `raycluster-kuberay-head-svc` service has the following targets:
  - `app.kubernetes.io/created-by=kuberay-operator`: `<ip-address>:10001 10001/TCP`
  - `app.kubernetes.io/name=kuberay`: `<ip-address>:6379 6379/TCP`
  - `ray.io/cluster=raycluster-kuberay`: `<ip-address>:8265 8265/TCP` (this is the forwarded port)
  - `ray.io/identifier=raycluster-kuberay-head`: `8080/TCP`
  - `ray.io/node-type=head`: `<ip-address>:8000 8000/TCP`
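To make the port list above easier to reason about, the following rough sketch maps each exposed port to the Ray interface that conventionally listens on it. The mapping is an assumption based on Ray and KubeRay defaults (6379 GCS, 8265 dashboard and Jobs API, 10001 Ray Client server, 8000 Serve, 8080 metrics) and is not read from the cluster; `describe_ray_address` is a hypothetical helper.

```python
from urllib.parse import urlsplit

# Assumed roles of the ports listed above, based on Ray/KubeRay defaults.
# They are not read from the cluster and may differ in a customized setup.
ASSUMED_PORT_ROLES = {
    6379: "gcs",
    8265: "dashboard",
    10001: "ray_client",
    8000: "serve",
    8080: "metrics",
}


def describe_ray_address(address: str) -> str:
    """Return the assumed role of the port in `address`, or 'unknown'."""
    # urlsplit only recognizes host:port when a '//' authority marker is present.
    parsed = urlsplit(address if "://" in address else f"//{address}")
    return ASSUMED_PORT_ROLES.get(parsed.port, "unknown")


if __name__ == "__main__":
    for addr in ("localhost:8265", "localhost:6379", "ray://localhost:10001"):
        print(addr, "->", describe_ray_address(addr))
```

Under these assumptions, the only forwarded port (8265) is the dashboard, while the GCS and Ray Client ports are not forwarded.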
I didn't find too much information on this issue and am unsure whether this issue belongs to the Hydra or Ray repository.
Checklist
System information