mc2-project / secure-xgboost

Secure collaborative training and inference for XGBoost.
https://mc2-project.github.io/secure-xgboost/
Apache License 2.0

multiclient secure-xgboost tutorial example yields error #142

Closed ryanleh closed 3 years ago

ryanleh commented 3 years ago

When running the multiclient example locally (in different terminals) I'm receiving the following errors:

Server

ryan@ryan-dev-vm:~/secure-xgboost/demo/python/multiclient-cluster-remote-control/server$ python3 enclave_serve.py
Waiting for client...
Rabit Module currently only work with dmlc worker, quit this program by exit 0
2021-04-08T05:19:29+0000.002183Z [(H)ERROR] tid(0x7f7a0effd700) | :OE_ENCLAVE_ABORTING [/source/host/calls.c:_call_enclave_function_impl:56]
2021-04-08T05:19:29+0000.002239Z [(H)ERROR] tid(0x7f7a0e7fc700) | :OE_ENCLAVE_ABORTING [/source/host/calls.c:_call_enclave_function_impl:56]
2021-04-08T05:19:29+0000.004897Z [(H)ERROR] tid(0x7f7a0effd700) | :OE_ENCLAVE_ABORTING [/source/host/calls.c:_call_enclave_function_impl:56]
Ecall failed: result=19 (OE_ENCLAVE_ABORTING)
Error type: <class 'securexgboost.core.XGBoostError'>
Error value:
  File "/usr/local/lib/python3.7/site-packages/securexgboost-0.1-py3.7.egg/securexgboost/remote_server.py", line 458, in rpc_get_remote_report_with_pubkey_and_nonce
    pem_key, key_size, nonce, nonce_size, client_list, client_list_size, remote_report, remote_report_size = self._serialize(remote_api.get_remote_report_with_pubkey_and_nonce, request)
  File "/usr/local/lib/python3.7/site-packages/securexgboost-0.1-py3.7.egg/securexgboost/remote_server.py", line 451, in _serialize
    ret = func(params)
  File "/usr/local/lib/python3.7/site-packages/securexgboost-0.1-py3.7.egg/securexgboost/core.py", line 2801, in get_remote_report_with_pubkey_and_nonce
    ctypes.byref(remote_report_size)))
  File "/usr/local/lib/python3.7/site-packages/securexgboost-0.1-py3.7.egg/securexgboost/core.py", line 203, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))

Orchestrator

ryan@ryan-dev-vm:~/secure-xgboost/demo/python/multiclient-cluster-remote-control/orchestrator$ python3 start_orchestrator.py
Waiting for client...
Hello from the orchestrator!

Client 1

ryan@ryan-dev-vm:~/secure-xgboost/demo/python/multiclient-cluster-remote-control/client1$ ./run.sh 127.0.0.1
Remote attestation

Client 2

ryan@ryan-dev-vm:~/secure-xgboost/demo/python/multiclient-cluster-remote-control/client2$ ./run.sh 127.0.0.1
Remote attestation
Traceback (most recent call last):
    File "client2.py", line 77, in <module>
        run(channel_addr, str(args.symmkey), str(args.privkey), str(args.cert))
    File "client2.py", line 22, in run
        xgb.attest()
    File "/usr/local/lib/python3.7/site-packages/securexgboost-0.1-py3.7.egg/securexgboost/core.py", line 2641, in attest
        response = _check_remote_call(stub.rpc_get_remote_report_with_pubkey_and_nonce(remote_pb2.Status(status=1)))
    File "/usr/local/lib/python3.7/site-packages/securexgboost-0.1-py3.7.egg/securexgboost/core.py", line 187, in _check_remote_call
        raise XGBoostError(ret.status.exception)

chester-leung commented 3 years ago

For this demo, during setup you'll need to

1) Modify demo/python/multiclient-cluster-remote-control/hosts.config to contain the IP addresses of all nodes in the cluster. To run locally with one node, edit this file to contain only 127.0.0.1:22.
2) Give the server machine (i.e. the node on which you modified hosts.config) SSH access to all nodes in the cluster.
3) Start the RPC servers on all machines by running:
   secure-xgboost/host/dmlc-core/tracker/dmlc-submit --cluster ssh --host-file hosts.config --num-workers <num_workers_in_cluster> --worker-memory 4g python3 server/enclave_serve.py
   This command is wrapped in a script, demo/python/multiclient-cluster-remote-control/run-distributed.sh, which takes one argument: the number of nodes in the cluster.

I think the issue is that you skipped the dmlc-submit command in step 3 and instead ran only python3 server/enclave_serve.py directly.
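
For a single-node local run, the setup above can be sketched in Python roughly as follows. This is only an illustration, not part of the secure-xgboost tooling: the helper names `render_hosts_config` and `build_submit_command` are hypothetical, and the one-entry-per-line layout of hosts.config is an assumption. The dmlc-submit path and flags are copied from step 3. The sketch only builds and prints the command; it does not launch anything.

```python
# Hypothetical helpers that mirror the setup steps; nothing here is secure-xgboost API.

# Path to the tracker script, as given in step 3 (relative to the directory
# that contains the secure-xgboost checkout).
DMLC_SUBMIT = "secure-xgboost/host/dmlc-core/tracker/dmlc-submit"


def render_hosts_config(hosts):
    """Return hosts.config contents, one host:port entry per line.

    The one-entry-per-line layout is an assumption for this sketch.
    """
    return "".join(f"{host}\n" for host in hosts)


def build_submit_command(num_workers, host_file="hosts.config",
                         worker_memory="4g",
                         script="server/enclave_serve.py"):
    """Return the step 3 dmlc-submit invocation as an argv list."""
    return [
        DMLC_SUBMIT,
        "--cluster", "ssh",
        "--host-file", host_file,
        "--num-workers", str(num_workers),
        "--worker-memory", worker_memory,
        "python3", script,
    ]


if __name__ == "__main__":
    # Single local node, as suggested in step 1.
    print(render_hosts_config(["127.0.0.1:22"]), end="")
    # Print the command rather than running it; pass the list to
    # subprocess.run(cmd, check=True) to actually start the RPC server.
    print(" ".join(build_submit_command(num_workers=1)))
```

Building the command as a list rather than a single string keeps each argument separate, so it can be handed to `subprocess.run` without shell quoting issues.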