Open surenyufuz opened 1 year ago
Hi @surenyufuz, thank you for raising this issue! Is it possible to use the default port 6379 as a workaround before the Ray community fixes this issue?
Thanks for your attention. I will probably not use the autoscaler on Kubernetes until this issue is fixed, because in some situations I have to use a random port with hostNetwork.
Since the autoscaler is recommended for GPU workloads, I hope this problem gets resolved. Thanks a lot.
Also running into this same issue. Unfortunately, in our case, we cannot use the default port 6379, because it is also the default Redis port and we have special routing configs for that port that are incompatible with Ray.
What happened + What you expected to happen
I deployed a Ray cluster on Kubernetes and specified port "61379" instead of "6379" in the ray start params.
It appears that the head service works well.
The workers also connect to the head node successfully.
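For reference, a plain TCP check is a quick way to confirm that the head is listening on the custom port. This is only a minimal sketch using the Python standard library; the helper name `port_is_open` is made up for illustration and is not part of Ray.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_open(<head service host>, 61379)` from a worker pod should return True if the head is reachable on the overridden port. The host value depends on the cluster and is not shown here.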
But the autoscaler container in the head node encountered the following exception:
It seems that the autoscaler is not aware of the overridden port number.
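One way to picture the suspected failure is the hypothetical sketch below. It is not the actual autoscaler code. If a component builds the head address with a hard-coded default port instead of reading `port` from the ray start params, it will try 6379 even though the head listens on 61379. The function names and the dict layout are assumptions made for illustration only.

```python
DEFAULT_PORT = 6379  # default head port mentioned in this thread


def head_address_hardcoded(head_ip: str) -> str:
    # Hypothetical buggy behavior: ignores any port override.
    return f"{head_ip}:{DEFAULT_PORT}"


def head_address_from_params(head_ip: str, ray_start_params: dict) -> str:
    # Hypothetical expected behavior: honor "port" from the ray start
    # params and fall back to the default only when it is absent.
    port = int(ray_start_params.get("port", DEFAULT_PORT))
    return f"{head_ip}:{port}"
```

With `{"port": "61379"}` the two helpers disagree on the port (6379 versus 61379), which matches the symptom described above: the head service and workers use the overridden port while the autoscaler does not.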
Versions / Dependencies
Ray version: 2.7.1
Python version: 3.8.13
Reproduction script
Issue Severity
High: It blocks me from completing my task.