ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[Core] Ray's randomly generated ports don't exclude already used ports #46453

Open totoroyyb opened 2 weeks ago

totoroyyb commented 2 weeks ago

What happened + What you expected to happen

Starting a Ray instance with ray start ... without specifying port numbers can lead to port conflicts. The randomly generated port numbers can collide with the worker port range: in the error below, metrics_export was assigned port 18203, which falls inside the worker_ports range (10002 to 19999). Ray's internal random port generation should exclude ports that are already in use.

ValueError: Ray component worker_ports is trying to use a port number 18203 that is used by other components.

Port information: {'gcs': 'random', 'object_manager': 'random', 'node_manager': 'random', 'gcs_server': 'random', 'client_server': 10001, 'dashboard': 8265, 'dashboard_agent_grpc': 53136, 'dashboard_agent_http': 52365, 'dashboard_grpc': 'random', 'runtime_env_agent': 62091, 'metrics_export': 18203, 'redis_shards': 'random', 'worker_ports': '9998 ports from 10002 to 19999'}
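
To make the conflict concrete, here is a minimal Python sketch that checks the fixed ports from the dump above against the worker port range, then draws a random port that avoids both. This is not Ray's actual port-selection code: find_conflicts and pick_port are hypothetical helpers, and the 1024 to 65535 candidate range is an assumption made for illustration.

```python
import random

# Fixed (non-'random') ports copied from the "Port information" dump above.
PORT_INFO = {
    "client_server": 10001,
    "dashboard": 8265,
    "dashboard_agent_grpc": 53136,
    "dashboard_agent_http": 52365,
    "runtime_env_agent": 62091,
    "metrics_export": 18203,
}

# "9998 ports from 10002 to 19999" (range() excludes its upper bound).
WORKER_PORTS = range(10002, 20000)


def find_conflicts(port_info, reserved):
    """Return the components whose port falls inside the reserved range."""
    return {name: port for name, port in port_info.items() if port in reserved}


def pick_port(reserved, used, low=1024, high=65535, rng=random):
    """Hypothetical helper: draw a random port outside `reserved` and `used`.

    The low/high bounds are an assumption, not values taken from Ray.
    """
    candidates = [
        p for p in range(low, high + 1) if p not in reserved and p not in used
    ]
    if not candidates:
        raise RuntimeError("no free port left in the requested range")
    return rng.choice(candidates)


conflicts = find_conflicts(PORT_INFO, WORKER_PORTS)
print(conflicts)  # {'metrics_export': 18203}

# A replacement port that cannot collide with the worker range or fixed ports.
safe_port = pick_port(WORKER_PORTS, used=set(PORT_INFO.values()))
```

The sketch only filters against ports known to the configuration. It does not check whether the operating system already has a port bound, which a real fix would probably also need to handle.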

Versions / Dependencies

ray, version 2.31.0

Reproduction script

Any ray start invocation that does not specify port numbers can potentially trigger this issue.

Issue Severity

Medium: It is a significant difficulty but I can work around it.
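
One possible workaround (not necessarily the one the reporter used) is to pin the conflicting port explicitly so it is never chosen at random. The sketch below only builds the command line and does not run it. It relies on the --metrics-export-port option of ray start; the --head flag, the port 20100, and the build_ray_start_cmd helper are assumptions chosen for illustration.

```python
import shlex

# Worker port range taken from the report: 10002 to 19999 inclusive.
WORKER_MIN, WORKER_MAX = 10002, 19999


def build_ray_start_cmd(metrics_export_port):
    """Hypothetical helper: build a `ray start` command with a pinned metrics port."""
    if WORKER_MIN <= metrics_export_port <= WORKER_MAX:
        raise ValueError(
            f"port {metrics_export_port} lies inside the worker port range "
            f"{WORKER_MIN}-{WORKER_MAX}"
        )
    # --head is an assumption; the report only shows `ray start ...`.
    return ["ray", "start", "--head", f"--metrics-export-port={metrics_export_port}"]


cmd = build_ray_start_cmd(20100)
print(shlex.join(cmd))  # ray start --head --metrics-export-port=20100
```

Pinning one port only avoids the specific collision seen in this report. Other components marked 'random' in the port dump could still land in the worker range.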

Superskyyy commented 2 weeks ago

I will fix that.

totoroyyb commented 3 days ago

@Superskyyy Thanks for the prompt reply. If possible, could you please point me to the code that handles random port generation? I can try to put together a quick PR for this.