aws-solutions-library-samples / guidance-for-secure-blockchain-validation-using-aws-nitro-enclaves

This Guidance shows how to deploy a secure, scalable, and cost-efficient blockchain key management solution for blockchain validation workloads like Ethereum 2.0 proof-of-stake networks.
https://aws.amazon.com/solutions/guidance/secure-blockchain-validation-using-aws-nitro-enclaves/
MIT No Attribution

SSL version error when connecting to status endpoint of web3signer using lambda #5

Closed EugeneFinch closed 11 months ago

EugeneFinch commented 12 months ago

We spent three days on this and got to the point where web3signer appears to run successfully and all the certificates and policies are in the right place.

But when we try to retrieve the status using the Lambda function, we get an error that we cannot debug.

{
  "errorMessage": "exception happened: HTTPSConnectionPool(host='signer.prodnitrovalidator.private', port=443): Max retries exceeded with url: /upcheck (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1129)')))",
  "errorType": "Exception",
  "requestId": "e3461532-36e6-4ce2-a818-3e09bbb31add",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 132, in lambda_handler\n    raise Exception(\"exception happened: {}\".format(e))\n"
  ]
}

We tried updating the web3signer Docker image to the latest version with JDK 17. We are running out of ideas and would love to finish this, but at the moment we can only give up.

It looks like the SSL version supported by the server is not supported by the Python library, which uses "OpenSSL 1.1.1t 7 Feb 2023". We checked this by writing a Lambda function that returns the SSL version Python uses. That version should actually support any recent SSL version.
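A check like the one described can be a few lines. The sketch below is an assumption about what such a diagnostic function might look like, not the code that was actually run; the handler name follows the usual Lambda convention and the keys of the returned dictionary are illustrative.

```python
import ssl


def lambda_handler(event, context):
    """Report the TLS library this Python runtime is linked against."""
    ctx = ssl.create_default_context()
    return {
        # For example "OpenSSL 1.1.1t 7 Feb 2023", as in the report above
        "openssl_version": ssl.OPENSSL_VERSION,
        # Whether the linked library was built with these protocol versions
        "has_tls_1_2": ssl.HAS_TLSv1_2,
        "has_tls_1_3": ssl.HAS_TLSv1_3,
        # Lowest protocol version a default client context will negotiate
        "minimum_version": ctx.minimum_version.name,
    }
```

OpenSSL 1.1.1 supports TLS 1.2 and TLS 1.3, so with this library a WRONG_VERSION_NUMBER error is unlikely to be a real protocol gap on the client side. The error is commonly seen when the peer answers with something other than a TLS record, for example plain HTTP on the same port, although this thread does not establish that as the cause here.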

Please advise.

dpdornseifer commented 11 months ago

Thanks for reaching out, @EugeneFinch. I will investigate. Can you please share all of the customizations you applied, e.g. the new Docker image tag?

czarly commented 11 months ago

It didn't work with the image tag from the original tutorial either. Changing the image tag to 23.9-jdk17 didn't change anything. No other customizations were applied.

EugeneFinch commented 11 months ago

@dpdornseifer

dpdornseifer commented 11 months ago

@czarly thanks for providing the details. So, if I understand correctly, you are not able to get web3signer to work at all with the default setup?

EugeneFinch commented 11 months ago

Correct, @dpdornseifer. When testing the Lambda function (Step 07), we can't get web3signer to operate.

However, starting the signer service (Step 06) as described in your walkthrough doc does appear to work: https://github.com/aws-samples/aws-nitro-enclave-blockchain-validator/blob/main/docs/walkthrough.md

Output results:

● nitro-signing-server.service - Nitro Enclaves Signing Server
   Loaded: loaded (/etc/systemd/system/nitro-signing-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2023-10-06 09:39:06 UTC; 3 days ago
 Main PID: 3955 (python3)
    Tasks: 5
   Memory: 69.0M
   CGroup: /system.slice/nitro-signing-server.service
           ├─3955 python3 /home/ec2-user/app/watchdog.py
           └─3993 /bin/nitro-cli run-enclave --cpu-count 2 --memory 3806 --eif-path /home/ec2-user/app/server/signing_server.eif --enclave-cid 16

Oct 06 09:39:06 ip-10-0-215-172.ap-southeast-1.compute.internal systemd[1]: Started Nitro Enclaves Signing Server.
Oct 06 09:39:06 ip-10-0-215-172.ap-southeast-1.compute.internal watchdog.py[3955]: Start allocating memory...
Oct 06 09:39:08 ip-10-0-215-172.ap-southeast-1.compute.internal watchdog.py[3955]: Started enclave with enclave-cid: 16, memory: 3806 MiB, cpu-ids: [1, 3]

i-071600d91e147a314:

● nitro-signing-server.service - Nitro Enclaves Signing Server
   Loaded: loaded (/etc/systemd/system/nitro-signing-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2023-10-06 09:38:38 UTC; 3 days ago
 Main PID: 3956 (python3)
    Tasks: 5
   Memory: 69.1M
   CGroup: /system.slice/nitro-signing-server.service
           ├─3956 python3 /home/ec2-user/app/watchdog.py
           └─3996 /bin/nitro-cli run-enclave --cpu-count 2 --memory 3806 --eif-path /home/ec2-user/app/server/signing_server.eif --enclave-cid 16

Oct 06 09:38:38 ip-10-0-159-168.ap-southeast-1.compute.internal systemd[1]: Started Nitro Enclaves Signing Server.
Oct 06 09:38:38 ip-10-0-159-168.ap-southeast-1.compute.internal watchdog.py[3956]: Start allocating memory...
Oct 06 09:38:40 ip-10-0-159-168.ap-southeast-1.compute.internal watchdog.py[3956]: Started enclave with enclave-cid: 16, memory: 3806 MiB, cpu-ids: [1, 3]

{ "Version": 9, "Tier": "Standard"

dpdornseifer commented 11 months ago

Hi @EugeneFinch @czarly, I cannot reproduce the error right now. I added a small e2e test script that automatically deploys the stack and configures all the private keys as specified in the walkthrough.md file:

https://github.com/aws-samples/aws-nitro-enclave-blockchain-validator/blob/main/tests/e2e/e2e_setup.sh

Can you please deploy the stack using the script and share the result with me?

EugeneFinch commented 11 months ago

Hi @dpdornseifer, I am having exactly the same issue as before when running the script.

Start: num_validators = 1 chain = goerli mnemonic_language = english withdrawal_address = 0x6f4b46423fc6181a0cf34e6716c220bd4d6c2471 Stack ID f97f90d0-67e6-11ee-858a-02eef778e902 will be used as web3signer_uuid Mnemonic generated! 1 / 1 - Encrypted validator key generated - pubkey: a361eb0d4999d6429a5b023389b3f238529008254dc37d9bfbf0348c0b40a91fe56ae92d3546292f7d6b492a1a42e486 1 / 1 - Deposit data generated - pubkey: a361eb0d4999d6429a5b023389b3f238529008254dc37d9bfbf0348c0b40a91fe56ae92d3546292f7d6b492a1a42e486 1 / 1 - Encrypting key, password and mnemonic using KMS 1 / 1 - Record - {'web3signer_uuid': 'f97f90d0-67e6-11ee-858a-02eef778e902', 'chain': 'goerli', 'pubkey': 'a361eb0d4999d6429a5b023389b3f238529008254dc37d9bfbf0348c0b40a91fe56ae92d3546292f7d6b492a1a42e486', 'encrypted_key_password_mnemonic_b64': 'AQICAHhs81lsZJCfsvV7wUp+yyna6PRm99zoRwzMQGtYCokXcQG3t654mupZ4tK53M0HfAzVAAAFTDCCBUgGCSqGSIb3DQEHBqCCBTkwggU1AgEAMIIFLgYJKoZIhvcNAQcBMB4GCWCGSAFlAwQBLjARBAxF0+kPAYA5ewEnqPwCARCAggT/ZuAE6o2UjN72DfZAm7waWM9/wvf8KeD6l5F+NegTrnHQT2x0KH62ewVOfoO+Qq4jgROrnSNg/rd3cRmPLWz539AlttYvDwKfW0YVfwg8tdlwh1+zo2D8G46gVyoL3xIKOo46D4wTglB/yx8czHU2Y4M/W6ogYGxke0HotXITw6okxenqivNjiWtmp7PwX9h5ZBWqWKvmWdw8jaRGEp7sTtg+6IgLNsZdWMsApisVyMPXtcvpq32evlHx4VF2SJexeEaggRBX+1Djpchk2BvOam5wVTE9eomJVsNoUxZxnwhsiCsueCmZa8rvCSxzic4O/yUKbMTgh+MzpVO+T8TGzlT/oVzCR8GLxZJ8h1vxpYJ0Mq1wLNt2p+/xd5RLL37Feme4GHayzrTnGc7xp9ueVQuhX2+r9JyXpCwbXD/tQQsPgk3Q9k3MTU2C70XjNSnQNIU3yUVDY6phhgeBx0WuoxwNClPO/9Sda1uY63Z7bxEMWhCRnDaJUY3d0/XCnw6aBWnxU8o2Dd4F+GZJOlogoffgHomNUFjc99lnnB9c+jaNWSCFFcQ/WIMRlq06ORlETLYO16xGlcpLNwOrDQhM2GMK71WHRcrT7sUsvjAiiu/yw3icC/V7CRe01N7orFl9gLwweXbwqV0WXXUTG05CotKeplGRNLQ1NXIccwyMUpiE/cFf833DHWq+mOePhvjozSryQ9NVy9h0cP6IXhDdH82me4z2BFdXw6NI7AhvcyKci9/1WFyc8+hpdq1HfBAXKg5HTCUPNZCcrXztIB3KfE3EOJr1umc4zkpYqtTgnRjT4UI3nmnM+wBzX1aPzvxDn2YbYtO41nzeM3GEHFonmNBX3sizxM4phFp0rdrfWcWKVWvQGa3ZGqNPUHDqvCe0aiUaARTlUUX/T5LT58Lf9bg3jGWp+Z31Bb/Ez+PQ3dM9HjKE8tSuSEuhoGCXVUhjDlTqCE/pxh+excAnxAruJLaRXZTwF75zHrQZFInT
K7KdfBbIghWVHM0SfUFrTaNzo++usxV623xCGEZ19N1bxXJ1T1C6BCnAFnvG9sBcCogPF15MBfq3f170mrNskL04RgR94RT1iiWg4E0C8IH7PCyqyAu37HTsOFQ+agAciQMlYzLW4AiVWbFwebIsElDt+tp8YMbt8UmarBOxpRpQjMNHG6oJckzhH+cDIM0zi7rYChQnEkIDMsf4gpF4oB2onXCW01GlIgfp0wOI35hPcX0WRYk++DKqQo1bXoFH9pUC/USu9uKiG9lYe85DppWhVwAQxSk8x7X70nQ/u9SOV4Hmn1PYLwkGGDtMrl9FA0ZhXLpWaDGZKrE4MgLwi4n47LQgBwFjwgPox7ki4p/OHPfgr2v34fZ8aGmgv5pupRE+h//9Ni4NFRdWMQi64poqvQ1RqJCHYQqpdP0Pb8VUPwKXkB5SbtRfPwl27fTMamTYwSCnP1fu03Ok0Ssh5SgKWRbHN4N/ZCaD56a5ZbAMHGmJXM7TiEbSp3o0PAu3TxT8JKEgMpk2+nbQeno9f0A+CfHjb3eEe9QOD3Q4321enBhOkM9d6y0x4nn7a/Pavabxl75aJqSXnJ4MmtwuCjNx62OgcURCdVLIsY2j5Kxz00WAea9JPib+CzoqubZ7FGH/ns7qYE8u/GzCN65idfSoaN3wbucyZhGyhwS7OWAe7ybcjvh2cpuVkSII/+lksA==', 'deposit_json_b64': 'W3sicHVia2V5IjogImEzNjFlYjBkNDk5OWQ2NDI5YTViMDIzMzg5YjNmMjM4NTI5MDA4MjU0ZGMzN2Q5YmZiZjAzNDhjMGI0MGE5MWZlNTZhZTkyZDM1NDYyOTJmN2Q2YjQ5MmExYTQyZTQ4NiIsICJ3aXRoZHJhd2FsX2NyZWRlbnRpYWxzIjogIjAxMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDZmNGI0NjQyM2ZjNjE4MWEwY2YzNGU2NzE2YzIyMGJkNGQ2YzI0NzEiLCAiYW1vdW50IjogMzIwMDAwMDAwMDAsICJzaWduYXR1cmUiOiAiODJhMzM4NmZhMjkxMTdmNWI5ZWY3MDA2ZjE5MDI5MTBhOTdiMDgxMTAyYjIxNDcyODkwNGVmMDAzMjBhNjE2ZjAwMDYxNDllMTEwNzgxNmViMDE2ZjcyNWU4ODNmNjUxMDExMjMwMTZmMzMzMmU5ZGNmNTc5MjY5ZTEzNmJiMjZmMWUyMDlkYWMxODIwYmQyNjExM2I2MTE5NTgxYzBkN2ZhYThlZTRkZTQ4N2QxNjYyZWJmNGZmNmY5OWMxZGEzIiwgImRlcG9zaXRfbWVzc2FnZV9yb290IjogIjc4ZTRhZjliYmU2NDQyOWRjYzk4ZjM5NjMyMDc4MmI4YjI4NTM0ZjM1NzYzNjUxOWNlNjY5N2M2ODE1OWI5NGEiLCAiZGVwb3NpdF9kYXRhX3Jvb3QiOiAiYzM2ZTk4MjkzMDk5OGE1ODcwYjFiYTU5ZWZmMDQxYTMzZTQ0Mjc0MDc2N2U1ZTg5YzMyOTkzOTk3Zjk5MDBjZCIsICJmb3JrX3ZlcnNpb24iOiAiMDAwMDEwMjAiLCAibmV0d29ya19uYW1lIjogImdvZXJsaSIsICJkZXBvc2l0X2NsaV92ZXJzaW9uIjogIjIuMy4wIn1d', 'datetime': '2023-10-11T05:37:52.397849', 'active': True} Writing validator keys record to DynamoDB Successfully written validator keys record to DynamoDB ['a361eb0d4999d6429a5b023389b3f238529008254dc37d9bfbf0348c0b40a91fe56ae92d3546292f7d6b492a1a42e486']

(11/10/2023 05:38:16) service has been started and is healthy

11/10/2023 05:38:16: sending request { "StatusCode": 200, "FunctionError": "Unhandled", "ExecutedVersion": "$LATEST" } result: {"errorMessage": "exception happened: HTTPSConnectionPool(host='signer.devnitrovalidator.private', port=443): Max retries exceeded with url: /upcheck (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1129)')))", "errorType": "Exception", "requestId": "3788bb6d-319d-4729-bd35-dee7f095cd7f", "stackTrace": [" File \"/var/task/lambda_function.py\", line 132, in lambda_handler\n raise Exception(\"exception happened: {}\".format(e))\n"]}

11/10/2023 05:38:17: sending request { "StatusCode": 200, "FunctionError": "Unhandled", "ExecutedVersion": "$LATEST" } result: {"errorMessage": "exception happened: HTTPSConnectionPool(host='signer.devnitrovalidator.private', port=443): Max retries exceeded with url: /api/v1/eth2/publicKeys (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1129)')))", "errorType": "Exception", "requestId": "89f458c0-e349-41d1-a5b2-49c33fb210d0", "stackTrace": [" File \"/var/task/lambda_function.py\", line 148, in lambda_handler\n raise Exception(\"exception happened: {}\".format(e))\n"]}

11/10/2023 05:38:19: sending request { "StatusCode": 200, "FunctionError": "Unhandled", "ExecutedVersion": "$LATEST" } result: {"errorMessage": "exception happened: HTTPSConnectionPool(host='signer.devnitrovalidator.private', port=443): Max retries exceeded with url: /upcheck (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1129)')))", "errorType": "Exception", "requestId": "6ebdf46f-eb2a-4517-9e45-64f038151f69", "stackTrace": [" File \"/var/task/lambda_function.py\", line 132, in lambda_handler\n raise Exception(\"exception happened: {}\".format(e))\n"]}

11/10/2023 05:38:19: sending request { "StatusCode": 200, "FunctionError": "Unhandled", "ExecutedVersion": "$LATEST" } result: {"errorMessage": "exception happened: HTTPSConnectionPool(host='signer.devnitrovalidator.private', port=443): Max retries exceeded with url: /api/v1/eth2/publicKeys (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1129)')))", "errorType": "Exception", "requestId": "97e42a0d-99b1-4ea9-974c-ab395ef7b852", "stackTrace": [" File \"/var/task/lambda_function.py\", line 148, in lambda_handler\n raise Exception(\"exception happened: {}\".format(e))\n"]}

....
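Each entry in the output above reports "StatusCode": 200 together with "FunctionError": "Unhandled". For a Lambda invocation, the 200 only means the invoke call itself went through; the FunctionError field signals that the function code raised. A minimal sketch of how such results could be sorted by cause follows. The function name and the category labels are hypothetical and not part of the repository's test script.

```python
import json


def classify_invocation(meta: dict, payload: str) -> str:
    """Classify one Lambda invocation result from the test output.

    `meta` is the invoke response metadata and `payload` is the JSON body
    returned by the function. The category names are illustrative labels.
    """
    # A 200 status alone does not mean success: FunctionError is set
    # whenever the function code raised an exception.
    if "FunctionError" not in meta:
        return "ok"
    message = json.loads(payload).get("errorMessage", "")
    if "WRONG_VERSION_NUMBER" in message:
        return "tls_protocol_mismatch"
    if "CERTIFICATE_VERIFY_FAILED" in message:
        return "certificate_not_trusted"
    return "other_error"
```

Applied to this thread, the original report falls into the first error category and every result from the e2e script falls into the second, so the two runs failed for different reasons.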

dpdornseifer commented 11 months ago

Thanks, @EugeneFinch. Just wondering: did you delete the old stack by running cdk destroy devNitroValidator before running the script? It seems that the error you are facing has changed now. It's no longer related to the version number but to the cert in general.

EugeneFinch commented 11 months ago

@dpdornseifer no, I did not delete the old stack. I reused the one created earlier.

dpdornseifer commented 11 months ago

@EugeneFinch OK, can you please test the deployment in a different region, e.g. eu-central-1, or delete the old stack first? The error above indicates non-matching certs, which could be caused by rerunning some of the scripts.
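One way to test the non-matching-certs theory is to compare the certificate the endpoint actually serves with the locally generated one. The sketch below is an illustration under assumptions, not part of the walkthrough: the function names are hypothetical, and because the signer host is private the check would have to run from inside the VPC.

```python
import hashlib
import ssl


def fingerprint(pem_cert: str) -> str:
    """Return the SHA-256 fingerprint of a PEM-encoded certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha256(der).hexdigest()


def served_cert_matches(host: str, port: int, expected_pem: str) -> bool:
    """Compare the certificate a server presents with a local copy."""
    # Without ca_certs, get_server_certificate does not validate the
    # certificate, so it also works for self-signed certificates.
    served_pem = ssl.get_server_certificate((host, port))
    return fingerprint(served_pem) == fingerprint(expected_pem)
```

For example, served_cert_matches("signer.devnitrovalidator.private", 443, local_pem) would return False if the certificate in use by the service is not the one the client was given to trust, where local_pem is the text of whichever certificate file the client uses.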

EugeneFinch commented 11 months ago

All right. Will re-run and get back to you.

EugeneFinch commented 11 months ago

On a side note, there were points during the installation process where I had to improvise. For example, step 10 (Deploy the sample code with the AWS CDK CLI) doesn't automatically create the S3 bucket, so you need to create it manually. See the attached screenshot of the error.

[Screenshot: Screen Shot 2023-10-12 at 1 10 10 PM]

EugeneFinch commented 11 months ago

Re-deployed. It works now. Tested the web3signer function and it's okay. Thanks.

[Screenshot: Screen Shot 2023-10-12 at 3 26 44 PM]

What we did earlier was change the config in app.py and run cdk deploy devNitroValidator -O output.json again, which apparently caused the above issue. The right way would have been to run cdk destroy first and then deploy again.
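The destroy-then-deploy order can be captured in a small helper. This is a sketch only: the function names and the dry_run switch are hypothetical, while the stack name and the -O output.json option are taken from the commands quoted in this thread.

```python
import subprocess
from typing import List


def redeploy_commands(stack: str, outputs_file: str = "output.json") -> List[List[str]]:
    """Build the CDK commands for a clean redeploy: destroy, then deploy."""
    return [
        ["cdk", "destroy", stack],
        ["cdk", "deploy", stack, "-O", outputs_file],
    ]


def redeploy(stack: str, dry_run: bool = True) -> None:
    """Print the commands and, unless dry_run is set, execute them in order."""
    for command in redeploy_commands(stack):
        print(" ".join(command))
        if not dry_run:
            # check=True stops before deploying if the destroy step fails
            subprocess.run(command, check=True)
```

Calling redeploy("devNitroValidator") only prints the two commands; pass dry_run=False to run them. The remaining walkthrough steps would presumably still need to be repeated against the fresh stack afterwards.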

dpdornseifer commented 11 months ago

Hi Eugene,

Regarding the S3 bucket issue: you have to run cdk bootstrap aws://ACCOUNT-NUMBER-1/REGION-1 aws://ACCOUNT-NUMBER-2/REGION-2 ... as specified here.
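Bootstrapping provisions the resources the CDK needs in each target environment, including the S3 bucket used for deployment assets, which is why skipping it shows up as a missing bucket. A minimal sketch of assembling the command for several environments follows; the helper name is hypothetical and the account ID in the usage note is a placeholder.

```python
def bootstrap_command(environments):
    """Build a `cdk bootstrap` command line for (account, region) pairs."""
    # Each environment is addressed as aws://ACCOUNT-NUMBER/REGION
    targets = [f"aws://{account}/{region}" for account, region in environments]
    return ["cdk", "bootstrap", *targets]
```

For example, bootstrap_command([("111122223333", "ap-southeast-1"), ("111122223333", "eu-central-1")]) yields the argument list for bootstrapping both regions discussed in this thread in one call.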

Regarding the second issue: re-deployment works in general, and you can also use cdk deploy to apply updates and new config to the existing stack. In this example it depends on the order of the commands, because local TLS cert generation is involved as well. Rerunning the entire process on an existing stack without updating the enclaves will eventually leave you with two TLS certs that do not match, which is the error we saw earlier.

Happy that it works now. Closing the issue.

EugeneFinch commented 11 months ago

Thank you