ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[Core] Cannot run 'ray list nodes' after setting the environment variable with 'export RAY_ADDRESS="http://127.0.0.1:8265"' #28847

Open DicardoX opened 2 years ago

DicardoX commented 2 years ago

What happened + What you expected to happen

I first used the following command to create a Ray head node:

After that, I could get the following node information with the command 'ray list nodes':

List: 2022-09-28 19:49:19.974051
Stats: Total: 1

Table:

    NODE_ID                                                   NODE_IP       STATE  NODE_NAME     RESOURCES_TOTAL
 0  8b62297b1b1281778a7db24698aa00539e6247cb64a70e9c529e6828  [IP_ADDRESS]  ALIVE  [IP_ADDRESS]  CPU: 48.0
                                                                                                 GPU: 4.0
                                                                                                 accelerator_type:TITAN: 1.0
                                                                                                 memory: 80285796967.0
                                                                                                 node:10.2.64.62: 1.0
                                                                                                 object_store_memory: 38693912985.0

Then, to tell the Ray Jobs CLI how to find my Ray cluster, I passed the Ray Dashboard address by running 'export RAY_ADDRESS="http://127.0.0.1:8265"'.

Unfortunately, when I ran 'ray list nodes' again, I only got the following warning, repeated every two seconds:

Versions / Dependencies

Reproduction script

Error messages:

2022-09-28 19:36:46,554 WARNING utils.py:1333 -- Unable to connect to GCS at http://127.0.0.1:8265. Check that (1) Ray GCS with matching version started successfully at the specified address, and (2) there is no firewall setting preventing access.
2022-09-28 19:36:48,556 WARNING utils.py:1333 -- Unable to connect to GCS at http://127.0.0.1:8265. Check that (1) Ray GCS with matching version started successfully at the specified address, and (2) there is no firewall setting preventing access.
2022-09-28 19:36:50,559 WARNING utils.py:1333 -- Unable to connect to GCS at http://127.0.0.1:8265. Check that (1) Ray GCS with matching version started successfully at the specified address, and (2) there is no firewall setting preventing access.
2022-09-28 19:36:52,562 WARNING utils.py:1333 -- Unable to connect to GCS at http://127.0.0.1:8265. Check that (1) Ray GCS with matching version started successfully at the specified address, and (2) there is no firewall setting preventing access.
2022-09-28 19:36:54,565 WARNING utils.py:1333 -- Unable to connect to GCS at http://127.0.0.1:8265. Check that (1) Ray GCS with matching version started successfully at the specified address, and (2) there is no firewall setting preventing access.
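The warning shows the GCS client being pointed at the HTTP dashboard URL taken from RAY_ADDRESS, which suggests the two kinds of address are being mixed up. As an illustration only, here is a minimal Python sketch that tells an HTTP(S) dashboard-style URL apart from a bare host:port address. The helper name `classify_address`, its labels, and the sample address `127.0.0.1:6379` are hypothetical and are not part of Ray's API or of this report.

```python
def classify_address(address: str) -> str:
    """Return a rough label for the kind of endpoint an address string names.

    Hypothetical helper for illustration; Ray's own address handling may differ.
    """
    scheme, sep, _rest = address.partition("://")
    if not sep:
        # No "scheme://" prefix: treat it as a bare host:port address.
        return "host:port"
    scheme = scheme.lower()
    if scheme in ("http", "https"):
        # Dashboard-style URL, like the value exported in this issue.
        return "http"
    if scheme == "ray":
        return "ray-client"
    return "unknown"


# The first value is the one from this issue; the second is an assumed example.
for candidate in ("http://127.0.0.1:8265", "127.0.0.1:6379"):
    print(candidate, "->", classify_address(candidate))
```

A check like this would flag the exported value as an HTTP URL before anything tries to use it as a GCS address.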

Issue Severity

No response

architkulkarni commented 2 years ago

This should be fixed by https://github.com/ray-project/ray/pull/28643; I verified that it works on the master branch. Can you try the Ray nightly build and see whether it's fixed for you as well?
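For releases without that fix, one possible workaround is to run the command with RAY_ADDRESS removed from the environment, since the report shows 'ray list nodes' working before the variable was exported. The Python sketch below assumes exactly that and nothing more; `env_without` and `run_without_ray_address` are hypothetical helper names, and whether this is sufficient in a given setup is an assumption.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence


def env_without(name: str, base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of an environment mapping with one variable removed."""
    source = os.environ if base is None else base
    return {key: value for key, value in source.items() if key != name}


def run_without_ray_address(command: Sequence[str]) -> subprocess.CompletedProcess:
    """Run a command in a child process whose environment lacks RAY_ADDRESS.

    The parent shell's export is left untouched, so other tools that rely on
    RAY_ADDRESS still see it.
    """
    return subprocess.run(list(command), env=env_without("RAY_ADDRESS"), check=False)
```

With these helpers, `run_without_ray_address(["ray", "list", "nodes"])` would invoke the CLI without the dashboard URL in its environment.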