Phala-Network / solo-mining-scripts

`phala sgx-test` fails with `SGX_ERROR_NETWORK_FAILURE` #59

Closed athei closed 2 years ago

athei commented 2 years ago

Hardware: Intel i9-9900k Asrock Z390 Phantom Gaming 6

Kernel: 5.17.7

Admittedly, this is a non-standard system, but I installed the dependencies manually: Docker and the kernel module.

When running phala sgx-test I get the following error. After attempting the remote attestation, it takes a few seconds before it comes back with the error, so I guess it is a network timeout. Can you disclose the IP:Port it is trying to connect to? That would make it easier to debug the network traffic.

thread '<unnamed>' panicked at 'error while doing remote attestation: SGX_ERROR_NETWORK_FAILURE', src/lib.rs:448:132

The full output:

Sleep 12s
aesm_service[15]: [get_qpl_handle ../qe_logic.cpp:263] Cannot open Quote Provider Library libdcap_quoteprov.so.1 and libdcap_quoteprov.so

aesm_service[15]: The server sock is 0x563566a312c0
Detecting SGX, this may take a minute...
aesm_service[15]: Malformed request received (May be forged for attack)
aesm_service[15]: InKernel LE loaded
aesm_service[15]: InKernel LE loaded
aesm_service[15]: InKernel LE loaded
aesm_service[15]: InKernel LE loaded
aesm_service[15]: InKernel LE loaded
aesm_service[15]: InKernel LE loaded
aesm_service[15]: InKernel LE loaded
aesm_service[15]: InKernel LE loaded
✔  SGX instruction set
  ✔  CPU support
  ✔  CPU configuration
  ✔  Enclave attributes
  ✔  Enclave Page Cache
  SGX features
    ✘  SGX2  ✘  EXINFO  ✘  ENCLV  ✘  OVERSUB  ✘  KSS  
    Total EPC size: 93.5MiB
✔  Flexible launch control
  ✔  CPU support
  ? CPU configuration
  ✔  Able to launch production mode enclave
✔  SGX system software
  ✔  SGX kernel device (/dev/sgx/enclave)
  ✔  libsgx_enclave_common
  ✔  AESM service
  ✔  Able to launch enclaves
    ✔  Debug mode
    ✔  Production mode
    ✔  Production mode (Intel whitelisted)

You're all set to start running SGX programs!
Generated machine id:
[censored]

CPU Cores:
16

Encoded runtime info:
[censored]
Testing RA...
aesm_service[15]: [ADMIN]EPID Provisioning initiated
aesm_service[15]: [ADMIN]EPID Provisioning failed due to network error
aesm_service[15]: SGX EPID provisioning network failure
thread '<unnamed>' panicked at 'error while doing remote attestation: SGX_ERROR_NETWORK_FAILURE', src/lib.rs:448:132
note: Call backtrace::enable_backtrace with 'PrintFormat::Short/Full' for a backtrace.
fatal runtime error: failed to initiate panic, error 5
./start_sgx_detect.sh: line 20:    41 Illegal instruction     (core dumped) ./app
----------------------------------------------------------------------------------------------------
 [ phala_scripts_check_sgxtest ] unknown error!
----------------------------------------------------------------------------------------------------

Is the source of the sgx-test binary available somewhere?

jasl commented 2 years ago

here https://github.com/Phala-Network/ra-test/blob/master/enclave/src/lib.rs

As for the error, I guess it's an IAS issue. AESMD (a daemon that supports some SGX features) needs to connect to the Intel Attestation Service to fetch the report; if that connection fails, the SGX app gets this error.

These log lines are the proof:

aesm_service[15]: [ADMIN]EPID Provisioning initiated
aesm_service[15]: [ADMIN]EPID Provisioning failed due to network error
aesm_service[15]: SGX EPID provisioning network failure

Can you confirm api.trustedservices.intel.com is reachable from your PC?
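
A minimal sketch of how the log check described above could be scripted. The function name `classify_ra_failure` is hypothetical and not part of the Phala scripts; the matched strings are copied from the aesm_service output quoted in this thread, and other failure messages are not covered.

```shell
# Hypothetical helper: classify an sgx-test log read from stdin.
# The patterns below are taken verbatim from the log quoted in this issue.
classify_ra_failure() {
    log=$(cat)
    case "$log" in
        *"EPID Provisioning failed due to network error"*|*"SGX_ERROR_NETWORK_FAILURE"*)
            # AESMD could not reach the remote service it needs.
            echo "network" ;;
        *"error while doing remote attestation"*)
            # RA failed, but not with the network error seen here.
            echo "other-ra-error" ;;
        *)
            echo "no-ra-error" ;;
    esac
}
```

It could be used as `phala sgx-test 2>&1 | classify_ra_failure`; a result of `network` points at connectivity rather than at the SGX setup itself.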

athei commented 2 years ago

It seems to be reachable:

wget https://api.trustedservices.intel.com/
--2022-05-17 16:25:30--  https://api.trustedservices.intel.com/
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving api.trustedservices.intel.com (api.trustedservices.intel.com)... 40.87.90.88
Connecting to api.trustedservices.intel.com (api.trustedservices.intel.com)|40.87.90.88|:443... connected.
HTTP request sent, awaiting response... 404 Resource Not Found
2022-05-17 16:25:31 ERROR 404: Resource Not Found.

However, this runs inside Docker, so I can't tell whether it is reachable from there.

jasl commented 2 years ago

It returns 404, so I guess it is reachable (it blocks ICMP, so you can't ping it). I remember that some of our miners have had trouble doing RA, probably with the same problem. Waiting a while and retrying, or switching to another IP, should work. If possible, try these methods first; I'm contacting some miners I know for details.

athei commented 2 years ago

I just ran a shell inside your Docker image:

docker run -ti phalanetwork/phala-sgx_detect /bin/sh

# wget api.trustedservices.intel.com
--2022-05-17 14:34:29--  http://api.trustedservices.intel.com/
Resolving api.trustedservices.intel.com (api.trustedservices.intel.com)... failed: Temporary failure in name resolution.
wget: unable to resolve host address 'api.trustedservices.intel.com'
# 

It seems networking doesn't work at all. It turns out that my firewall just drops forwarded packets. I adjusted the firewall config. Sorry for wasting your time :)
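
For anyone hitting the same symptom, here is a rough sketch of a check that separates a name-resolution failure (as in the wget output above) from a TCP connection failure. The function name `check_reachable` is hypothetical, and the sketch assumes `getent`, `timeout`, and `bash` are available wherever it is run, which may not hold inside a minimal container image.

```shell
# Hypothetical helper (not part of the Phala scripts): tell a DNS failure
# apart from a TCP connection failure for a given host and port.
check_reachable() {
    local host=$1 port=${2:-443}
    # Step 1: resolve the name through the system resolver.
    if ! getent hosts "$host" >/dev/null 2>&1; then
        echo "dns-failure"
        return 1
    fi
    # Step 2: attempt a plain TCP connect via bash's /dev/tcp, capped at 5 seconds.
    if ! timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
        echo "connect-failure"
        return 2
    fi
    echo "ok"
}
```

Running `check_reachable api.trustedservices.intel.com 443` on the host and then inside the container shows whether the two environments behave differently. A `dns-failure` only in the container would be consistent with forwarded packets being dropped, as happened here.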