Closed by ryanleh 3 years ago
For this demo, during setup you'll need to:
1) Modify demo/python/multiclient-cluster-remote-control/hosts.config to contain the IP addresses of all nodes in the cluster. To run locally with one node, edit this file to contain only 127.0.0.1:22.
2) Give the server machine (i.e. the node on which you modified hosts.config) SSH access to all nodes in the cluster.
3) Start the RPC servers on all machines by running:
secure-xgboost/host/dmlc-core/tracker/dmlc-submit --cluster ssh --host-file hosts.config --num-workers <num_workers_in_cluster> --worker-memory 4g python3 server/enclave_serve.py
This command is wrapped in a script, demo/python/multiclient-cluster-remote-control/run-distributed.sh, that takes one argument: the number of nodes in the cluster.
I think the issue is that you didn't run this command in step 3, but instead ran only python3 server/enclave_serve.py.
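To make the difference from a bare python3 server/enclave_serve.py concrete, here is a minimal sketch that assembles the step 3 command from hosts.config without running it. The helper names count_hosts and build_launch_command are hypothetical and not part of secure-xgboost. The sketch assumes hosts.config lists one host:port entry per line and that the number of workers should equal the number of hosts; adjust both if your setup differs.

```python
from pathlib import Path

# Path to the launcher as quoted in step 3; adjust to your checkout.
DMLC_SUBMIT = "secure-xgboost/host/dmlc-core/tracker/dmlc-submit"


def count_hosts(host_file):
    """Count non-empty lines in hosts.config.

    Assumption: the file holds one host:port entry per line.
    """
    lines = Path(host_file).read_text().splitlines()
    return sum(1 for line in lines if line.strip())


def build_launch_command(host_file, worker_memory="4g"):
    """Return the step 3 command as an argument list, without running it.

    Assumption: one worker per host listed in the host file.
    """
    return [
        DMLC_SUBMIT,
        "--cluster", "ssh",
        "--host-file", str(host_file),
        "--num-workers", str(count_hosts(host_file)),
        "--worker-memory", worker_memory,
        "python3", "server/enclave_serve.py",
    ]
```

For a local single-node run, hosts.config contains only 127.0.0.1:22, so build_launch_command("hosts.config") yields the dmlc-submit invocation with --num-workers 1. The resulting list can be passed to subprocess.run, or you can simply call run-distributed.sh with 1 as its argument.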
When running the multiclient example locally (in different terminals) I'm receiving the following errors:
Server
Orchestrator
Client 1
Client 2