mars-project / mars

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
https://mars-project.readthedocs.io
Apache License 2.0

[BUG] How to set the IP address in cluster mode #3347

Closed PaulWongDlut closed 1 year ago

PaulWongDlut commented 1 year ago

Setting: This is my first time using Mars, and I have run into some problems that are probably easy for you.

I am using Mars as a distributed computation engine. When I set up cluster mode as the docs describe, I am confused about the IP address here:

mars-worker -H -p -s :

mars.new_session('http://<web_ip>:<web_port>')

(from https://mars-project.readthedocs.io/zh_CN/latest/installation/deploy.html)

Question:

  1. Does the IP mean the external network IP address or the internal one?
  2. I tried the external network IP address, and it failed.
  3. When I use the internal network IP address, with VS Code on my Mac doing remote development on a Linux system, how can I open the Mars Web UI in a browser on my Mac using the Linux machine's internal network IP address?

zhongchun commented 1 year ago

Thanks for your question. The web_ip in mars.new_session('http://<web_ip>:<web_port>') is the IP address of the supervisor node, and web_port is the port you set with -w when launching the supervisor. For example, we can launch a supervisor as follows:

mars-supervisor -H 11.122.207.65 -p 7001 -w 9001 -s 11.122.207.65:9001 --log-level info --log-format "%(asctime)s %(levelname)s %(filename)s:%(lineno)d - %(message)s"

11.122.207.65 is the IP address of my node. The supervisor then prints the following log:

2023-05-12 10:58:16,555 INFO api.py:135 - Metric mars.operand.group_keys_num will be initialized when invoking `init_metrics()`.
2023-05-12 10:58:16,856 INFO api.py:73 - Finished initialize the metrics of backend: console.
2023-05-12 10:58:17,334 INFO api.py:135 - Metric mars.operand.group_rows_num will be initialized when invoking `init_metrics()`.
2023-05-12 10:58:17,334 INFO api.py:135 - Metric mars.operand.group_keys_num will be initialized when invoking `init_metrics()`.
2023-05-12 10:58:17,645 INFO api.py:73 - Finished initialize the metrics of backend: console.
2023-05-12 10:58:17,672 INFO fury.py:729 - Created fury serializer <pyfury._serialization.Fury object at 0x7fd485524d60> for thread 140551938828096.
2023-05-12 10:58:17,698 INFO fury.py:729 - Created fury serializer <pyfury._serialization.Fury object at 0x7fb0ecac57e0> for thread 140398657570624.
2023-05-12 10:58:17,727 INFO api.py:191 - Initializing a gauge whose name: mars.scheduling.request_worker_time_sec, tag keys: None, backend: console
2023-05-12 10:58:17,728 INFO api.py:191 - Initializing a gauge whose name: mars.scheduling.offline_worker_time_sec, tag keys: None, backend: console
2023-05-12 10:58:17,729 INFO supervisor.py:73 - Mars Web access log level is set to WARNING.
2023-05-12 10:58:17,732 INFO supervisor.py:90 - Mars Web started at 11.122.207.65:9001

So 11.122.207.65:9001 is the web address.

First of all, if you want to access the Mars Web service running on your remote Linux machine, make sure it is reachable over the network. You can verify this with ping or telnet.
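
If telnet is not installed, roughly the same reachability check can be sketched in Python with only the standard library. The helper name `is_port_open` is hypothetical (it is not part of Mars), and the address used in the note below is simply the one from the log above. Run it from the machine where you want to open the browser.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability,
    it does not verify that the service behind the port is Mars.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, `is_port_open("11.122.207.65", 9001)` returning False from your Mac while returning True on the Linux machine itself would suggest that the address is not routable from the Mac, or that a firewall is blocking the port.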

I don't understand what you mean by external and internal network IP. You can run Python code on your remote Linux machine to check whether Mars has launched.
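
A minimal sketch of such a check, assuming Mars is installed on the Linux machine and the supervisor was started as in the example above. The helpers `web_endpoint` and `check_cluster` are hypothetical names introduced here, and the `.execute().fetch()` pattern assumes a reasonably recent Mars release.

```python
def web_endpoint(web_ip: str, web_port: int) -> str:
    """Build the URL passed to mars.new_session (hypothetical helper)."""
    return f"http://{web_ip}:{web_port}"


def check_cluster(web_ip: str, web_port: int):
    """Connect to the supervisor's web endpoint and run a tiny computation.

    Mars is imported lazily so that this file can be loaded on machines
    where Mars is not installed.
    """
    import mars
    import mars.tensor as mt

    # Connect to the web address reported in the supervisor log.
    mars.new_session(web_endpoint(web_ip, web_port))
    # Sum a 100x100 tensor of ones on the cluster and pull the result back.
    return mt.ones((100, 100), chunk_size=50).sum().execute().fetch()
```

Calling `check_cluster("11.122.207.65", 9001)` on the Linux machine should return a number if the supervisor and at least one worker are up. An exception raised while connecting usually points to a wrong address or port.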

zhongchun commented 1 year ago

@PaulWongDlut Any other questions? If not, I'll close this issue.

zhongchun commented 1 year ago

@PaulWongDlut I'll close this issue. Feel free to open a new issue at any time if you run into a problem.