[Closed] simplew2011 closed this issue 3 months ago
cc @jacobtomlinson maybe you have time to help here
With --ipc=host --network host in every docker run, process_data.py runs OK.

2024-05-31 08:08:34,530 - distributed.scheduler - INFO - Scheduler at: tcp://192.168.30.161:8786
2024-05-31 08:08:34,530 - distributed.scheduler - INFO - dashboard at: http://192.168.30.161:8787/status
2024-05-31 08:10:16,884 - distributed.scheduler - INFO - Register worker <WorkerState 'tcp://192.168.20.139:43259', status: init, memory: 0, processing: 0>
2024-05-31 08:10:17,174 - distributed.scheduler - INFO - Starting worker compute stream, tcp://192.168.20.139:43259
2024-05-31 08:10:17,175 - distributed.core - INFO - Starting established connection to tcp://192.168.20.139:43478
2024-05-31 08:10:17,178 - distributed.scheduler - INFO - Register worker <WorkerState 'tcp://192.168.20.139:37555', status: init, memory: 0, processing: 0>
Dask workers connect to each other on random high ports, so when you run the workers without any network or port settings they cannot talk to each other.
Setting --network host works because it removes network isolation. Alternatively, you can create a dedicated Docker network that allows all the containers to talk to each other; this is what we do in our documentation examples.
docker network create dask
docker run --network dask -p 8787:8787 --name scheduler ghcr.io/dask/dask dask scheduler # start scheduler
docker run --network dask ghcr.io/dask/dask dask worker scheduler:8786 # start worker
docker run --network dask ghcr.io/dask/dask dask worker scheduler:8786 # start worker
docker run --network dask ghcr.io/dask/dask dask worker scheduler:8786 # start worker
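When the containers run on different machines (as in the three-machine setup in this issue), a shared Docker network alone does not span hosts. A rough alternative sketch, not from the original thread, is to pin the normally random worker port and publish it; the --worker-port, --contact-address, and --no-nanny flags of dask worker are assumptions here, and the IPs are taken from the setup described below:

```shell
# Sketch only: pin the port Dask would otherwise choose at random,
# publish it with -p, and advertise an address the scheduler can reach.
docker run -p 8786:8786 -p 8787:8787 ghcr.io/dask/dask \
    dask scheduler                                  # on the scheduler machine

docker run -p 9000:9000 ghcr.io/dask/dask \
    dask worker tcp://192.168.30.161:8786 \
    --no-nanny \
    --worker-port 9000 \
    --contact-address tcp://192.168.20.139:9000     # on worker1
```

The key point is that the worker must advertise an address (--contact-address) that the scheduler and the other workers can actually route to, not its internal container IP.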
Following these steps, everything works and I am unable to reproduce the dashboard issue with the latest version of the ghcr.io/dask/dask image.
I have three machines: one scheduler machine and two worker machines.
scheduler machine (ip: 192.168.30.161): start the scheduler
worker1 machine (ip: 192.168.20.139): start a worker
worker2 machine (ip: 192.168.25.141): start a worker
On the scheduler machine, start Dask to load and process the data:
process_data.py
An error occurs in the scheduler,
and some errors occur in the workers,
and the dashboard is empty; I cannot find the status and worker info.
If I use a LocalCluster, process_data.py runs OK in Docker.
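The LocalCluster case that works can be sketched like this (assumptions: the dask.distributed package shipped in the image, and an illustrative inc function standing in for the real processing code):

```python
# Minimal LocalCluster sketch: scheduler and workers live inside one
# container and one process, so no cross-container networking is involved.
from dask.distributed import Client, LocalCluster

def inc(x):
    return x + 1

cluster = LocalCluster(n_workers=2, processes=False)  # threaded workers, no extra ports
client = Client(cluster)
result = client.submit(inc, 41).result()
print(result)  # 42
client.close()
cluster.close()
```

Because everything stays in one process, none of the port or interface problems above can occur, which is consistent with LocalCluster working while the multi-machine setup fails.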
If I don't use Docker, and instead start the Dask scheduler and the two workers with console commands on different machines, then running the business code is also fine.
Is there any problem with my Docker startup commands?
With --interface lo in docker run, some errors occur; I'm not sure what parameters --host needs to be set to. Running ifconfig on all three machines, I find the same network interface, lo.
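A side note on --interface lo, not from the original thread: lo is the loopback interface, which other machines can never reach, so binding Dask to it will not help. A stdlib sketch to see which interfaces are actually visible inside a container (Linux only; socket.if_nameindex is assumed available there):

```python
# List the network interfaces visible in this container/host (Linux).
# Inside a default bridge-network container you typically see only
# "lo" and "eth0"; the host's real NIC is hidden, which is why
# --network host (or a shared Docker network) is needed for Dask.
import socket

for index, name in socket.if_nameindex():
    print(index, name)
```

Every Linux machine has a lo interface, so seeing lo on all three machines is expected and does not mean they share a usable network path.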