Azure / iotedge

The IoT Edge OSS project
MIT License

Failed to provision device using DPS and TPM #2954

Closed MPapst closed 4 years ago

MPapst commented 4 years ago

Expected Behavior

When creating a DPS individual enrollment with TPM attestation, the device should be able to provision itself through DPS.

Current Behavior

IoT Edge fails to use the TPM.

Steps to Reproduce


  1. Installed IoT Edge on Ubuntu 18.10 LTS on brand-new hardware (HPE ML350 with TPM 2.0)
  2. Configured IoT Edge for provisioning with TPM by creating a registration ID using the TPM (see the config sketch after this list)
  3. Created an individual IoT Edge-capable enrollment in the portal using the registration ID and the endorsement key
  4. Restarted iotedge
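
For context, the provisioning section of /etc/iotedge/config.yaml that step 2 produces looks roughly like this (a sketch of the DPS TPM attestation settings; the scope ID and registration ID values are placeholders):

```yaml
# /etc/iotedge/config.yaml (excerpt) -- DPS provisioning with TPM attestation
provisioning:
  source: "dps"
  global_endpoint: "https://global.azure-devices-provisioning.net"
  scope_id: "<ID Scope of the DPS instance>"
  attestation:
    method: "tpm"
    registration_id: "<registration ID created from the TPM>"
```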

IoT Edge fails to start with the following output in journalctl:

```
sudo journalctl -u iotedge -f
-- Logs begin at Wed 2020-05-13 09:14:31 UTC. --
May 13 09:51:12 svr-hymon-00002 iotedged[16912]: Error: Time:Wed May 13 09:51:12 2020 File:/project/edgelet/hsm-sys/azure-iot-hsm-c/deps/utpm/src/tpm_codec.c Func:Initialize_TPM_Codec Line:258 creating tpm_comm object
May 13 09:51:12 svr-hymon-00002 iotedged[16912]: 2020-05-13T09:51:12Z [ERR!] (/project/edgelet/hsm-sys/azure-iot-hsm-c/src/hsm_client_tpm_device.c:initialize_tpm_device:273) Failure initializeing TPM Codec
May 13 09:51:12 svr-hymon-00002 iotedged[16912]: 2020-05-13T09:51:12Z [ERR!] (/project/edgelet/hsm-sys/azure-iot-hsm-c/src/hsm_client_tpm_device.c:hsm_client_tpm_create:306) Failure initializing tpm device.
May 13 09:51:12 svr-hymon-00002 systemd[1]: iotedge.service: Main process exited, code=exited, status=1/FAILURE
May 13 09:51:12 svr-hymon-00002 systemd[1]: iotedge.service: Failed with result 'exit-code'.
May 13 09:51:13 svr-hymon-00002 systemd[1]: iotedge.service: Service hold-off time over, scheduling restart.
May 13 09:51:13 svr-hymon-00002 systemd[1]: iotedge.service: Scheduled restart job, restart counter is at 5.
May 13 09:51:13 svr-hymon-00002 systemd[1]: Stopped Azure IoT Edge daemon.
May 13 09:51:13 svr-hymon-00002 systemd[1]: Dependency failed for Azure IoT Edge daemon.
May 13 09:51:13 svr-hymon-00002 systemd[1]: iotedge.service: Job iotedge.service/start failed with result 'dependency'.
```
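
The failing call is Initialize_TPM_Codec, which cannot create its TPM communication object, so a first step is to confirm the kernel actually exposes a TPM device node. A quick sketch (the /dev/tpm0 and /dev/tpmrm0 paths assume a TPM 2.0 with the in-kernel resource manager):

```sh
# Is a TPM character device present at all?
ls -l /dev/tpm0 /dev/tpmrm0

# Did the kernel detect a TPM at boot?
sudo dmesg | grep -i tpm
```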

Context (Environment)

Output of iotedge check

```
√ config.yaml is well-formed - OK
‼ config.yaml has well-formed connection string - Warning
    Device not configured with manual provisioning, in this configuration 'iotedge check' is not able
    to discover the device's backing IoT Hub. To run connectivity checks in this configuration please
    specify the backing IoT Hub name using --iothub-hostname switch if you have that information.
    If no hostname is provided, all hub connectivity tests will be skipped.
√ container engine is installed and functional - OK
√ config.yaml has correct hostname - OK
× config.yaml has correct URIs for daemon mgmt endpoint - Error
    Error: could not execute list-modules request: an error occurred trying to connect: Connection refused (os error 111)
‼ latest security daemon - Warning
    Installed IoT Edge daemon has version 1.0.9.1 but 1.0.9 is the latest stable version available.
    Please see https://aka.ms/iotedge-update-runtime for update instructions.
√ host time is close to real time - OK
√ container time is close to host time - OK
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
‼ production readiness: certificates - Warning
    The Edge device is using self-signed automatically-generated development certificates.
    They will expire in 89 days (at 2020-08-11 09:51:11 UTC) causing module-to-module and downstream
    device communication to fail on an active deployment. After the certs have expired, restarting
    the IoT Edge daemon will trigger it to generate new development certs.
    Please consider using production certificates instead.
    See https://aka.ms/iotedge-prod-checklist-certs for best practices.
√ production readiness: container engine - OK
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
× production readiness: Edge Agent's storage directory is persisted on the host filesystem - Error
    Could not check current state of edgeAgent container
× production readiness: Edge Hub's storage directory is persisted on the host filesystem - Error
    Could not check current state of edgeHub container
```

Device Information

Runtime Versions

Additional Notes

From iotedge check, there seems to be a bug with the version recognition :)

Installed IoT Edge daemon has version 1.0.9.1 but 1.0.9 is the latest stable version available.

I am not sure where this comes from; it's a fresh installation:

× config.yaml has correct URIs for daemon mgmt endpoint - Error
    Error: could not execute list-modules request: an error occurred trying to connect: Connection refused (os error 111)
darobs commented 4 years ago

Hello @MPapst,

This is telling me it was unable to connect to the TPM device. Does the iotedge user (default user for the daemon) have access to it?
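
A quick way to check that on the device (assuming the default /dev/tpm0 node and the iotedge service user):

```sh
# Who owns the TPM device node, and with what mode?
ls -l /dev/tpm0

# Which groups does the daemon's user belong to?
id iotedge

# Does the iotedge user have read/write permission on the node?
sudo -u iotedge test -r /dev/tpm0 && sudo -u iotedge test -w /dev/tpm0 \
  && echo "iotedge can access the TPM" || echo "iotedge has no access"
```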

MPapst commented 4 years ago

@darobs, you are right... it was missing the rights for the TPM. For the record: I introduced a new tpm group that owns the TPM device (instead of the root group) and added the iotedge user to that group.
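
For reference, a sketch of the equivalent commands (the tpm group name and /dev/tpm0 path match this setup; adjust if your TPM is exposed differently):

```sh
# Create a dedicated group and hand the TPM device node over to it
sudo groupadd --system tpm
sudo chown root:tpm /dev/tpm0
sudo chmod 0660 /dev/tpm0

# Add the daemon's user to the group and restart the service
sudo usermod -aG tpm iotedge
sudo systemctl restart iotedge
```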

Isn't that something the installation script can/should take care of?

MPapst commented 4 years ago

Found it in the documentation - closing this issue
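
For anyone else landing here: a plain chown of /dev/tpm0 does not survive a reboot, and the documentation addresses that with a udev rule. A minimal sketch along those lines, assuming the tpm group from the comment above (the exact rule in the docs may differ):

```sh
# Persist the group ownership across reboots with a udev rule
sudo tee /etc/udev/rules.d/99-iotedge-tpm.rules >/dev/null <<'EOF'
KERNEL=="tpm[0-9]*", SUBSYSTEM=="tpm", GROUP="tpm", MODE="0660"
EOF

# Reload the rules and re-apply them to existing devices
sudo udevadm control --reload-rules
sudo udevadm trigger
```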