Azure / iotedge


Edge CA certificate does not renew and SimulatedTemperature fails to connect #5788

Open curua2008 opened 3 years ago

curua2008 commented 3 years ago

Expected Behavior

When using EST certificate providers like GlobalSign or DigiCert, IoT Edge should renew the Edge CA certificate when it expires.

This happens when we have the device certificate setting in config.toml as shown below:

[edge_ca]
method = "est"
url = "https://xxxdevca.est.edge.dev.globalsign.com:443/.well-known/est/"

Current Behavior

iotedge was able to obtain an Edge CA certificate from the EST server, as shown below.

The Edge CA cert expired after 2 days. [image]

No certificate renewal happened.

A Microsoft SimulatedTemperatureSensor module that was deployed before the certificate expired was working, but it failed to connect after the cert expired.

[image]

Steps to Reproduce

  1. Set up an EST server that issues an Edge CA cert that expires in 2 days

  2. Configure IoT Edge using a config file similar to the one below
[provisioning]
source = "dps"
global_endpoint = "https://global.azure-devices-provisioning.net/"
id_scope = "0ne0xxxxA7F0"

[provisioning.attestation]
method = "x509"
registration_id = "xxx-bootstrap-globalsign-vnm2"

identity_cert = { method = "est", common_name = "xxxxootstrap-globalsign-vnm2", url ="https://xxx.est.edge.dev.globalsign.com:443/.well-known/est/" }

[cert_issuance.est]
trusted_certs = [
        "file:///var/secrets/globalsign_root.pem",
]

[cert_issuance.est.auth]
#username = "xxx"
#password = "xx"

bootstrap_identity_cert = "file:///var/secrets/xxx_bootstrap_gs.pem"
bootstrap_identity_pk = "pkcs11:token=IoTEdgeCert;object=bootstrap-rsa-pair?pin-value=xxx" # PKCS#11 URI

[cert_issuance.est.urls]
#default = "https://xxxiden.est.edge.dev.globalsign.com:443/.well-known/est/"

[aziot_keys]
pkcs11_lib_path = "/usr/local/lib/libtpm2_pkcs11.so"
#pkcs11_base_slot = "pkcs11:token=IoTEdgeCert?pin-value=xxx"

[edge_ca]
method = "est"
url = "https://xxxdevca.est.edge.dev.globalsign.com:443/.well-known/est/"
  3. Deploy the Microsoft SimulatedTemperatureSensor module
  4. Note the behavior after the Edge CA cert expires
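
For the last step, one way to confirm the expiry time is to read the certificate's end date with `openssl x509 -noout -enddate -in <edge-ca.pem>` and compare it to the current time. The following is a minimal sketch, not part of the original report: `<edge-ca.pem>` is a placeholder for wherever the Edge CA certificate is stored on the device, the helper names are hypothetical, the sample line is made up for illustration, and the month-name parsing assumes an English or C locale.

```python
from datetime import datetime, timezone


def parse_openssl_enddate(line: str) -> datetime:
    """Parse a line such as 'notAfter=Oct 30 20:36:10 2021 GMT' into a UTC datetime."""
    prefix = "notAfter="
    text = line.strip()
    if not text.startswith(prefix):
        raise ValueError(f"unexpected openssl output: {line!r}")
    # %b depends on the locale; this assumes English month abbreviations.
    parsed = datetime.strptime(text[len(prefix):], "%b %d %H:%M:%S %Y GMT")
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(not_after: datetime, now: datetime) -> bool:
    """Return True if the certificate is no longer valid at `now`."""
    return now >= not_after


# Hypothetical sample line in the format printed by `openssl x509 -enddate`.
sample = "notAfter=Oct 30 20:36:10 2021 GMT"
not_after = parse_openssl_enddate(sample)
print(not_after.isoformat())
print(is_expired(not_after, datetime.now(timezone.utc)))
```

Keeping the parsing separate from the comparison makes it easy to test the logic without a real certificate on hand.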

Context (Environment)

Host OS: Ubuntu 18.04
Architecture: amd64
Container OS: Linux

Output of iotedge check

Configuration checks (aziot-identity-service)
---------------------------------------------
√ keyd configuration is well-formed - OK
√ certd configuration is well-formed - OK
√ tpmd configuration is well-formed - OK
√ identityd configuration is well-formed - OK
√ daemon configurations up-to-date with config.toml - OK
√ identityd config toml file specifies a valid hostname - OK
‼ aziot-identity-service package is up-to-date - Warning
    Installed aziot-identity-service package has version 1.3.0 but 1.2.3 is the latest stable version available.
    Please see https://aka.ms/aziot-update-runtime for update instructions.
√ host time is close to reference time - OK
× production readiness: identity certificates expiry - Error
    DPS identity 'device-id' expired at 2021-10-30 20:36:10 UTC
× production readiness: EST identity and bootstrap certificates expiry - Error
    x509 identity 'est-id' expired at 2021-10-30 20:36:09 UTC
√ preloaded certificates are valid - OK
√ keyd is running - OK
√ certd is running - OK
√ identityd is running - OK
√ read all preloaded certificates from the Certificates Service - OK
√ read all preloaded key pairs from the Keys Service - OK
√ ensure all preloaded certificates match preloaded private keys with the same ID - OK

Connectivity checks (aziot-identity-service)
--------------------------------------------
‼ host can connect to and perform TLS handshake with iothub AMQP port - Warning
    Could not retrieve iothub_hostname from provisioning file.
    Please specify the backing IoT Hub name using --iothub-hostname switch if you have that information.
    Since no hostname is provided, all hub connectivity tests will be skipped.
‼ host can connect to and perform TLS handshake with iothub HTTPS / WebSockets port - Warning
    Could not retrieve iothub_hostname from provisioning file.
    Please specify the backing IoT Hub name using --iothub-hostname switch if you have that information.
    Since no hostname is provided, all hub connectivity tests will be skipped.
‼ host can connect to and perform TLS handshake with iothub MQTT port - Warning
    Could not retrieve iothub_hostname from provisioning file.
    Please specify the backing IoT Hub name using --iothub-hostname switch if you have that information.
    Since no hostname is provided, all hub connectivity tests will be skipped.
√ host can connect to and perform TLS handshake with DPS endpoint - OK

Configuration checks
--------------------
√ aziot-edged configuration is well-formed - OK
√ configuration up-to-date with config.toml - OK
√ container engine is installed and functional - OK
× configuration has correct URIs for daemon mgmt endpoint - Error
    Unable to find image 'mcr.microsoft.com/azureiotedge-diagnostics:1.2.420211006.4' locally
    docker: Error response from daemon: manifest for mcr.microsoft.com/azureiotedge-diagnostics:1.2.420211006.4 not found: manifest unknown: manifest tagged by "1.2.420211006.4" is not found.
    See 'docker run --help'.
‼ aziot-edge package is up-to-date - Warning
    Installed IoT Edge daemon has version 1.2.420211006.4 but 1.2.4 is the latest stable version available.
    Please see https://aka.ms/iotedge-update-runtime for update instructions.
× container time is close to host time - Error
    Could not query local time inside container
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
√ production readiness: container engine - OK
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
‼ production readiness: Edge Agent's storage directory is persisted on the host filesystem - Warning
    The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
‼ production readiness: Edge Hub's storage directory is persisted on the host filesystem - Warning
    The edgeHub module is not configured to persist its /tmp/edgeHub directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.

Connectivity checks
-------------------
19 check(s) succeeded.
9 check(s) raised warnings. Re-run with --verbose for more details.
4 check(s) raised errors. Re-run with --verbose for more details.
7 check(s) were skipped due to errors from other checks. Re-run with --verbose for more details.

Device Information

Host OS: Ubuntu 18.04
Architecture: amd64
Container OS: Linux

Runtime Versions

iotedge 1.2.420211006.4

aziot-edged [run iotedge version]: https://github.com/Azure/iot-identity-service/suites/3964124249/artifacts/99607813
Edge Agent [image tag (e.g. 1.0.0)]:
Edge Hub [image tag (e.g. 1.0.0)]:
Docker/Moby [run docker version]:


onalante-msft commented 3 years ago

At the moment, the edge CA certificate is only checked for expiry when a deployment is made. See issue_cert and check_edge_ca in master; and refresh_cert and prepare_edge_ca in 1.2. certd does not do lifecycle management, and instead defers that task to service consumers. I believe we are considering automating certificate lifecycle management since it is clearly possible to not have a deployment within the certificate renewal window.
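
To illustrate the gap described above, here is a minimal sketch of a time-based renewal check that does not depend on a deployment happening. It is not the logic in `check_edge_ca` or `prepare_edge_ca`; the function name, the 80% lifetime threshold, and the example issue time are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def needs_renewal(not_before: datetime, not_after: datetime,
                  now: datetime, lifetime_fraction: float = 0.8) -> bool:
    """Return True once `lifetime_fraction` of the validity period has elapsed.

    The 0.8 default is an illustrative policy, not an IoT Edge setting.
    """
    if not_after <= not_before:
        raise ValueError("not_after must be later than not_before")
    renew_at = not_before + (not_after - not_before) * lifetime_fraction
    return now >= renew_at


# Example: a certificate valid for 2 days, as in this report.
# The issue time below is hypothetical.
issued = datetime(2021, 10, 28, 20, 36, 10, tzinfo=timezone.utc)
expires = issued + timedelta(days=2)

print(needs_renewal(issued, expires, issued + timedelta(hours=12)))  # False
print(needs_renewal(issued, expires, issued + timedelta(hours=40)))  # True
```

A periodic timer calling a check like this would catch a short-lived certificate even when no deployment occurs inside the renewal window, which is the case this issue describes.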

curua2008 commented 3 years ago

@onalante-msft Thanks for the comment. When you say certd "defers that task to service consumers", do we have to keep track of the Edge CA certificate expiration date, manually delete the expired cert, and restart iotedge to get a new valid cert?

maksokami commented 3 years ago

Response here suggests the opposite - that iotedge is supposed to manage certificates on a basic level (renewal). https://github.com/Azure/iot-identity-service/issues/300#issuecomment-946018542

> At the moment, the edge CA certificate is only checked for expiry when a deployment is made. See issue_cert and check_edge_ca in master; and refresh_cert and prepare_edge_ca in 1.2. certd does not do lifecycle management, and instead defers that task to service consumers. I believe we are considering automating certificate lifecycle management since it is clearly possible to not have a deployment within the certificate renewal window.

onalante-msft commented 3 years ago

Sorry, I should have been more clear. In this case, I would refer to identityd as a consumer of certd. identityd, as the device-side source of truth for module identities (and hence certificates), is responsible for managing the lifecycle of module identities as far as certd is concerned. Module developers should not need to manage the edge CA lifecycle, and can fully expect identityd to update module identities when appropriate. I do not think adding this feature to identityd is a specific tracked item at the moment, but I can probably look into it since I am working in a conceptually adjacent area.

github-actions[bot] commented 2 years ago

This issue is being marked as stale because it has been open for 30 days with no activity.

vjrantal commented 2 years ago

> I do not think adding this feature to identityd is a specific tracked item at the moment

@onalante-msft @pmzara Would it be possible to create an issue in the identity service repository to track this feature?

jlian commented 2 years ago

Hi folks, just want to let everyone know that we're actively working on this and hope to include the feature in an upcoming release soon.

jlian commented 2 years ago

Same applies to https://github.com/Azure/iotedge/issues/5787