Azure / iot-identity-service

Source of the Azure IoT Identity Service and related services.
MIT License
37 stars 46 forks source link

SoftHSM + EST server | C_Decrypt failed with CKR_BUFFER_TOO_SMALL #486

Closed mquillin closed 1 year ago

mquillin commented 1 year ago

I am running into what looks like a bug when I spin up the edge runtime services. At first everything seems to work fine and the device is associated to the IoT Hub and generates all the keys and certs - but as soon as I add it to a deployment it breaks. Several containers are successfully deployed before hitting this error. I have a total of 7 containers in addition to edgeAgent and edgeHub.

I am doing a pretty basic install on a raspberry pi with 64 bit os lite

# install edge runtime
curl https://packages.microsoft.com/config/debian/11/packages-microsoft-prod.deb > ./packages-microsoft-prod.deb && sudo apt install ./packages-microsoft-prod.deb && sudo apt-get update

sudo apt-get install -y moby-engine
sudo apt-get install -y aziot-edge

sudo apt-get install -y softhsm2:armhf && sudo usermod -aG softhsm aziotks
sudo rm -rf /var/lib/softhsm/tokens/* && sudo softhsm2-util --init-token --free --label my-token --so-pin so1234 --pin 1234 && sudo chown -R aziotks:aziotks /var/lib/softhsm/tokens

config.toml

[provisioning]
source = "dps"
global_endpoint = "https://global.azure-devices-provisioning.net/"
id_scope = "0nexxxx"

[provisioning.attestation]
method = "x509"
registration_id = "10000000a4c116f9"

[provisioning.attestation.identity_cert]
method = "est"
common_name = "10000000a4c116f9"

[provisioning.attestation.identity_cert.auto_renew]
rotate_key = false
threshold = "80%"
retry = "4%"

[cert_issuance.est.auth]
username = "xxx"
password = "xxxx"

[cert_issuance.est.urls]
default = "https://xxxxx/.well-known/est/"

[aziot_keys]
pkcs11_lib_path = "/usr/lib/softhsm/libsofthsm2.so"
pkcs11_base_slot = "pkcs11:token=my-token?pin-value=1234"
Nov 29 12:36:04 MLQ aziot-edged[7120]: 2022-11-29T20:36:04Z [INFO] - Successfully queried system info
Nov 29 12:36:04 MLQ aziot-edged[7120]: 2022-11-29T20:36:04Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 29 12:36:04 MLQ aziot-edged[7120]: 2022-11-29T20:36:04Z [INFO] - <-- POST /modules/%24edgeAgent/genid/638053484245984049/decrypt?api-version=2022-08-03 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/json", "content-length": "5363"}
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [INFO] - <-- POST /key?api-version=2020-09-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "74"}
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [INFO] - <-- POST /decrypt?api-version=2020-09-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "5695"}
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - C_Decrypt failed with CKR_BUFFER_TOO_SMALL
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - !!! internal error
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - !!! caused by: could not decrypt
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - !!! caused by: could not decrypt: AZIOT_KEYS_RC_ERR_EXTERNAL
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [INFO] - --> 500 {"content-type": "application/json"}
Nov 29 12:36:04 MLQ aziot-edged[7120]: 2022-11-29T20:36:04Z [INFO] - --> 500 {"content-type": "application/json"}
Nov 29 12:36:04 MLQ aziot-edged[7120]: 2022-11-29T20:36:04Z [INFO] - <-- POST /modules/%24edgeAgent/genid/638053484245984049/decrypt?api-version=2022-08-03 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/json", "content-length": "5363"}
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [INFO] - <-- POST /key?api-version=2020-09-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "74"}
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - could not find objects
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - !!! internal error
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - !!! caused by: could not generate key
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - !!! caused by: could not create key: AZIOT_KEYS_RC_ERR_EXTERNAL
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [INFO] - --> 500 {"content-type": "application/json"}
Nov 29 12:36:04 MLQ aziot-edged[7120]: 2022-11-29T20:36:04Z [INFO] - --> 500 {"content-type": "application/json"}
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [INFO] - <-- POST /encrypt?api-version=2021-05-01 {"content-length": "355", "content-type": "application/json"}
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - could not find objects
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - !!! internal error
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - !!! caused by: could not generate key
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [ERR!] - !!! caused by: could not create key: AZIOT_KEYS_RC_ERR_EXTERNAL
Nov 29 12:36:04 MLQ aziot-keyd[7138]: 2022-11-29T20:36:04Z [INFO] - --> 500 {"content-type": "application/json"}
Nov 29 12:36:04 MLQ aziot-identityd[7125]: 2022-11-29T20:36:04Z [WARN] - Failed to send HTTP request (attempt 1 of 2): connection error: error:80005001:aziot_key_openssl_engine:aziot_key_rsa_priv_enc::key/aziot-key-openssl-engine/src/lib.rs:112:internal error, error:141F0006:SSL routines:tls_construct_cert_verify:EVP lib:../ssl/statem/statem_lib.c:308:

edgeAgent error logs for reference

<3> 2022-11-29 21:42:11.586 +00:00 [ERR] - Edge agent plan execution failed.
Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadCommunicationException- Message:Error calling Encrypt: unable to load master encryption key: internal error, StatusCode:500, at:   at Microsoft.Azure.Devices.Edge.Util.Edged.Version_2019_01_30.WorkloadClient.HandleException(Exception ex, String operation) in /mnt/vss/_work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/version_2019_01_30/WorkloadClient.cs:line 108
   at Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadClientVersioned.Execute[T](Func`1 func, String operation) in /mnt/vss/_work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/WorkloadClientVersioned.cs:line 81
   at Microsoft.Azure.Devices.Edge.Util.Edged.Version_2019_01_30.WorkloadClient.EncryptAsync(String initializationVector, String plainText) in /mnt/vss/_work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/version_2019_01_30/WorkloadClient.cs:line 69
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.UpdateCurrentConfig(DeploymentConfigInfo deploymentConfigInfo) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 312
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.ReconcileAsync(CancellationToken token) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 208
<4> 2022-11-29 21:42:17.546 +00:00 [WRN] - Reconcile failed because of the an exception
Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadCommunicationException- Message:Error calling Encrypt: unable to load master encryption key: internal error, StatusCode:500, at:   at Microsoft.Azure.Devices.Edge.Util.Edged.Version_2019_01_30.WorkloadClient.HandleException(Exception ex, String operation) in /mnt/vss/_work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/version_2019_01_30/WorkloadClient.cs:line 108
   at Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadClientVersioned.Execute[T](Func`1 func, String operation) in /mnt/vss/_work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/WorkloadClientVersioned.cs:line 81
   at Microsoft.Azure.Devices.Edge.Util.Edged.Version_2019_01_30.WorkloadClient.EncryptAsync(String initializationVector, String plainText) in /mnt/vss/_work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/version_2019_01_30/WorkloadClient.cs:line 69
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.UpdateCurrentConfig(DeploymentConfigInfo deploymentConfigInfo) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 314
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.ReconcileAsync(CancellationToken token) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 208

This is very easily reproduced on my end

Also note that everything works as expected when these lines are removed from the config

[aziot_keys]
pkcs11_lib_path = "/usr/lib/softhsm/libsofthsm2.so"
pkcs11_base_slot = "pkcs11:token=my-token?pin-value=1234"
micahl commented 1 year ago

@mquillin ack... I can repro this as well.

micahl commented 1 year ago

@arsing any insight into what might be the issue here? I saw the same behavior. EdgeHub starts up and when it attempts to decrypt it hits this error in the keys service. Everything else appeared to work prior (provisioning via DPS using EST for cert issuance).

Nov 30 07:01:11 fooDeb aziot-edged[81616]: 2022-11-30T07:01:11Z [INFO] - Starting module edgeHub...
Nov 30 07:01:11 fooDeb aziot-edged[81616]: 2022-11-30T07:01:11Z [INFO] - Starting new listener for module edgeHub
Nov 30 07:01:11 fooDeb aziot-edged[81616]: 2022-11-30T07:01:11Z [INFO] - Starting workload API...
Nov 30 07:01:11 fooDeb aziot-edged[81616]: 2022-11-30T07:01:11Z [INFO] - --> 204 {}
Nov 30 07:01:11 fooDeb aziot-edged[81616]: 2022-11-30T07:01:11Z [INFO] - <-- POST /modules/%24edgeAgent/genid/637907616673364985/encrypt?api-version=2022-08-03 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/json", "content-length": "2442"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - <-- POST /key?api-version=2020-09-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "74"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - <-- POST /encrypt?api-version=2020-09-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "2774"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-edged[81616]: 2022-11-30T07:01:11Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-edged[81616]: 2022-11-30T07:01:11Z [INFO] - <-- POST /modules/%24edgeHub/genid/637907616673364985/certificate/server?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/json", "content-length": "67"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - <-- GET /keypair/aziot-edged-ca?api-version=2020-09-01 {"host": "keyd.sock"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-certd[81624]: 2022-11-30T07:01:11Z [INFO] - <-- POST /certificates?api-version=2020-09-01 {"content-type": "application/json", "host": "certd.sock", "content-length": "1361"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - <-- POST /parameters/algorithm?api-version=2021-05-01 {"content-length": "252", "content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - <-- POST /parameters/rsa-modulus?api-version=2021-05-01 {"content-length": "252", "content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - <-- POST /parameters/rsa-exponent?api-version=2021-05-01 {"content-length": "252", "content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - <-- POST /encrypt?api-version=2021-05-01 {"content-length": "359", "content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:11Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-certd[81624]: 2022-11-30T07:01:11Z [INFO] - --> 201 {"content-type": "application/json"}
Nov 30 07:01:11 fooDeb aziot-edged[81616]: 2022-11-30T07:01:11Z [INFO] - --> 201 {"content-type": "application/json"}
Nov 30 07:01:12 fooDeb aziot-edged[81616]: 2022-11-30T07:01:12Z [INFO] - <-- GET /trust-bundle?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close"}
Nov 30 07:01:12 fooDeb aziot-certd[81624]: 2022-11-30T07:01:12Z [INFO] - <-- GET /certificates/aziot-edged-trust-bundle?api-version=2020-09-01 {"host": "certd.sock"}
Nov 30 07:01:12 fooDeb aziot-certd[81624]: 2022-11-30T07:01:12Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:12 fooDeb aziot-edged[81616]: 2022-11-30T07:01:12Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:12 fooDeb aziot-edged[81616]: 2022-11-30T07:01:12Z [INFO] - <-- POST /modules/%24edgeHub/genid/637907616673364985/sign?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/json", "content-length": "169"}
Nov 30 07:01:12 fooDeb aziot-identityd[81620]: 2022-11-30T07:01:12Z [INFO] - <-- GET /identities/modules/$edgeHub?api-version=2020-09-01&type=aziot {"host": "identityd.sock"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- POST /encrypt?api-version=2021-05-01 {"content-length": "355", "content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- GET /key/aziot_identityd_master_id?api-version=2021-05-01 {"host": "keyd.sock"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- POST /derivedkey?api-version=2021-05-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "336"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- POST /derivedkey/export?api-version=2021-05-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "724"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- POST /derivedkey?api-version=2021-05-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "340"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- POST /derivedkey/export?api-version=2021-05-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "736"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-identityd[81620]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- POST /sign?api-version=2020-09-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "898"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-edged[81616]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-edged[81616]: 2022-11-30T07:01:13Z [INFO] - <-- POST /modules/%24edgeHub/genid/637907616673364985/encrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/json", "content-length": "438"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- POST /key?api-version=2020-09-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "74"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- POST /encrypt?api-version=2020-09-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "766"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-edged[81616]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-edged[81616]: 2022-11-30T07:01:13Z [INFO] - <-- POST /modules/%24edgeHub/genid/637907616673364985/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/json", "content-length": "463"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- POST /key?api-version=2020-09-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "74"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - <-- POST /decrypt?api-version=2020-09-01 {"content-type": "application/json", "host": "keyd.sock", "content-length": "791"}
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [ERR!] - C_Decrypt failed with CKR_BUFFER_TOO_SMALL
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [ERR!] - !!! internal error
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [ERR!] - !!! caused by: could not decrypt
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [ERR!] - !!! caused by: could not decrypt: AZIOT_KEYS_RC_ERR_EXTERNAL
Nov 30 07:01:13 fooDeb aziot-keyd[81633]: 2022-11-30T07:01:13Z [INFO] - --> 500 {"content-type": "application/json"}
arsing commented 1 year ago

EA and EH are the first (and usually only) callers of the symmkey encrypt/decrypt API. Also they only decrypt when they have something to decrypt, so I think the deployment itself will go fine and decrypt will only happen if they're restarted. So the fact that everything works before that point is to be expected.

As for why softhsm's C_Decrypt returns CKR_BUFFER_TOO_SMALL for something that was encrypted by the same library's C_Encrypt, we'd need to debug it to be sure. I don't see any issues about it in their tracker.

Given we haven't seen this obvious breakage before, is it only broken on Debian 11's version of softhsm?

Lastly, while we obviously should investigate it, I will note that using softhsm doesn't give any benefit over just using filesystem keys, so if you need immediate relief you can remove the PKCS#11 config as you already tried.

arsing commented 1 year ago

Also, using DPS and using EST should not be relevant. Can you confirm the same thing happens even with simple manual-symmkey provisioning and quickstart CA?

micahl commented 1 year ago

@arsing, yes, I see the same error in that configuration as well. I was also on Debian 11 so dunno about others. Will give it a quick try on Ubuntu 20.04.

arsing commented 1 year ago

Given we haven't seen this obvious breakage before, is it only broken on Debian 11's version of softhsm?

Actually it's possible this has always been broken. I don't know if there's ever been a full Edge test with softhsm, and nothing in i-i-s itself would hit the symmkey encrypt/decrypt API.

The error is probably coming from https://github.com/opendnssec/SoftHSMv2/blame/2.6.1/src/lib/SoftHSM.cpp#L3221-L3225 . It asserts that the plaintext buffer is exactly as long as the ciphertext buffer. In our code we give it a plaintext buffer that's 16 bytes smaller than the ciphertext buffer because the ciphertext includes a 16-byte tag. https://github.com/Azure/iot-identity-service/blob/1.4.1/key/aziot-keys/src/key.rs#L510

The softhsm check is overly conservative, but we should cater to it by removing the - 16 anyway.

micahl commented 1 year ago

Ah, that makes sense.

mquillin commented 1 year ago

Thanks for looking into this! I have switched to using the filesystem for now, I want to use softhsm to verify the interface so I know we have a working system before testing various HSM's.

arsing commented 1 year ago

Can you test with the Debian 11 aarch64 artifact from https://github.com/Azure/iot-identity-service/actions/runs/3587571056 ?

micahl commented 1 year ago

@mquillin fwiw I see it working now with @arsing's draft change. Here's what I did to easily replace the bits on my VM:

  1. downloaded and extracted the .deb package I needed
  2. modified the Version in the package's control file from 1.4.0~dev-1 to 1.4.1-1 using this handy script
  3. sudo apt remove aziot-edge aziot-identity-service
  4. sudo apt install ./aziot-identity-service_1.4.0~dev-1_amd64.modfied.deb; sudo apt install aziot-edge
  5. sudo iotedge config apply
mquillin commented 1 year ago

@arsing do you have an ETA when this fix will be released? Also is this changed rolled into the docker container (i.e. mcr.microsoft.com/azureiotedge-agent:1.4.3) or will it require updating with apt?

I saw a new version 1.4.4 and wasn't sure if that included this fix; or in general this service.

arsing commented 1 year ago

https://github.com/Azure/iot-identity-service/pull/490 is the PR for 1.4. It's waiting for backport approval internally.

micahl commented 1 year ago

@mquillin that internal discussion happens tomorrow afternoon so we should be able to provide an update after that.

mquilling commented 1 year ago

@micahl has this been released yet?

I am using https://packages.microsoft.com/config/debian/11/packages-microsoft-prod.deb

and still see some issues on boot

Feb 09 16:35:18 MLQ aziot-keyd[969]: 2023-02-10T00:35:18Z [ERR!] - invalid parameter "id": not found
micahl commented 1 year ago

@mquilling sorry, yes. The fix went in mid-December and was picked up as part of the 1.4.8 patch for IoT Edge although looks like we forgot to mention it in the release notes. The identity service was bumped to 1.4.2. If you're installing the latest aziot-edge and aziot-identity-service from packages.microsoft.com then you should have it.

I believe that error about invalid parameter "id": not found is unrelated (and probably safe to ignore). Do you see it repeatedly occurring or is it more of a one-time thing?

mquilling commented 1 year ago

@micahl thanks for the confirmation!

its working now, I had switched to 32 bit for softhsm, and all my containers broke.

any chance you know if there is a softhsm arm64v8 package?

the standard install is a 32 bit version

micahl commented 1 year ago

Should be: https://packages.debian.org/bullseye/softhsm2

What you're getting / seeing is probably based on how apt-get is configured.

mquilling commented 1 year ago

that is what I am installing, softhsm2-util works and everything

Feb 09 16:25:46 MLQ aziot-keyd[12758]: 2023-02-10T00:25:46Z [ERR!] - could not load library: /usr/lib/softhsm/libsofthsm2.so: wrong ELF class: ELFCLASS64

does this mean the key service is compiled as 32 bit and looking for 32 bit but found 64?

micahl commented 1 year ago

Which RPi OS are you on, 32- or 64-bit?

mquilling commented 1 year ago

64 bit, works on 32 bit fine - but really wanted to use 64

micahl commented 1 year ago

We provide 64-bit packages with each release. You can find them on the github release at https://github.com/Azure/azure-iotedge/releases/tag/1.4.8

mquillin commented 1 year ago

great! I will try installing that instead of the prod link.

for reference I know I can update the edgeAgent and edgeHub containers via the cloud. Does the identity service require an apt update? In general what’s the recommended way to update all the iot edge components?

micahl commented 1 year ago

Yes, the identity service is updated via apt. Refer to our doc page on Update IoT Edge.

In the context of updating if you're looking at upgrading between patch releases (e.g. 1.4.1 -> 1.4.2) then it shouldn't matter whether you first update the system containers and then .deb packages or vice-versa since we keep the interface between the two stable such that a newer version of one should work with an older version of the other.

If upgrading between product releases (e.g. 1.1 -> 1.4) then that flexibility still exists, but we'd recommend attempting to update both in tandem since a specific combination of host components + system containers is what we test. An approach to doing that would be to stop the system, uninstall the deb packages, upgrade Docker if not already on the latest, install the aziot-edge package, update your module deployment manifest in the cloud, then restart the runtime.