Azure / iotedge

iotedge check with DPS & x.509 configuration returns "cannot resolve IoT Hub hostname" Errors #2033

Closed: crpietschmann closed this issue 4 years ago

crpietschmann commented 4 years ago

Expected Behavior

The /etc/iotedge/config.yaml file is set up for DPS provisioning using x.509 certificates.

When running sudo iotedge check --verbose --iothub-hostname https://iothub-name.azure-devices.net (with my IoT Hub hostname), all checks should succeed without error.

Current Behavior

The iotedge check command returns the following errors for the connectivity checks, reporting that the IoT Hub hostname cannot be resolved. However, running host IOT-HUB-HOSTNAME.azure-devices.net on the same machine resolves the IoT Hub hostname correctly.

Connectivity checks
-------------------
√ host can connect to and perform TLS handshake with DPS endpoint - OK
× host can connect to and perform TLS handshake with IoT Hub AMQP port - Error
    Could not connect to https://IOT-HUB-HOSTNAME.azure-devices.net:5671 : could not resolve hostname
        caused by: failed to lookup address information: Name or service not known
× host can connect to and perform TLS handshake with IoT Hub HTTPS / WebSockets port - Error
    Could not connect to https://IOT-HUB-HOSTNAME.azure-devices.net:443 : could not resolve hostname
        caused by: failed to lookup address information: Name or service not known
× host can connect to and perform TLS handshake with IoT Hub MQTT port - Error
    Could not connect to https://IOT-HUB-HOSTNAME.azure-devices.net:8883 : could not resolve hostname
        caused by: failed to lookup address information: Name or service not known
× container on the default network can connect to IoT Hub AMQP port - Error
    Container on the default network could not connect to https://IOT-HUB-HOSTNAME.azure-devices.net:5671
        caused by: docker returned exit code: 1, stderr = Error: could not resolve Azure IoT Hub hostname: failed to lookup address information: Name does not resolve
× container on the default network can connect to IoT Hub HTTPS / WebSockets port - Error
    Container on the default network could not connect to https://IOT-HUB-HOSTNAME.azure-devices.net:443
        caused by: docker returned exit code: 1, stderr = Error: could not resolve Azure IoT Hub hostname: failed to lookup address information: Name does not resolve
× container on the default network can connect to IoT Hub MQTT port - Error
    Container on the default network could not connect to https://IOT-HUB-HOSTNAME.azure-devices.net:8883
        caused by: docker returned exit code: 1, stderr = Error: could not resolve Azure IoT Hub hostname: failed to lookup address information: Name does not resolve
× container on the IoT Edge module network can connect to IoT Hub AMQP port - Error
    Container on the azure-iot-edge network could not connect to https://IOT-HUB-HOSTNAME.azure-devices.net:5671
        caused by: docker returned exit code: 1, stderr = Error: could not resolve Azure IoT Hub hostname: failed to lookup address information: Name does not resolve
× container on the IoT Edge module network can connect to IoT Hub HTTPS / WebSockets port - Error
    Container on the azure-iot-edge network could not connect to https://IOT-HUB-HOSTNAME.azure-devices.net:443
        caused by: docker returned exit code: 1, stderr = Error: could not resolve Azure IoT Hub hostname: failed to lookup address information: Name does not resolve
× container on the IoT Edge module network can connect to IoT Hub MQTT port - Error
    Container on the azure-iot-edge network could not connect to https://IOT-HUB-HOSTNAME.azure-devices.net:8883
        caused by: docker returned exit code: 1, stderr = Error: could not resolve Azure IoT Hub hostname: failed to lookup address information: Name does not resolve
× Edge Hub can bind to ports on host - Error
    Could not check current state of Edge Hub container
        caused by: docker returned exit code: 1, stderr = Error: No such object: edgeHub

10 check(s) succeeded.
3 check(s) raised warnings.
11 check(s) raised errors.
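
One detail visible in the output above: every failing target is printed with the https:// scheme still attached (for example https://IOT-HUB-HOSTNAME.azure-devices.net:5671), whereas the host command that resolves correctly is given the bare name. Whether that accounts for the failures is not confirmed anywhere in this issue. The Python sketch below only illustrates the difference between the two inputs. The helper names bare_hostname and resolves are hypothetical and are not part of iotedge.

```python
import socket
from urllib.parse import urlsplit


def bare_hostname(value: str) -> str:
    """Return only the host part of a hostname or URL-like string."""
    # urlsplit only fills in the host when it sees "//", so add it for bare names.
    candidate = value if "://" in value else "//" + value
    host = urlsplit(candidate).hostname
    if not host:
        raise ValueError(f"no hostname found in {value!r}")
    return host


def resolves(name: str) -> bool:
    """Report whether the system resolver accepts `name` exactly as given."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except (OSError, UnicodeError):
        # socket.gaierror (lookup failure) is a subclass of OSError.
        return False
```

On the device, comparing resolves("https://iothub-name.azure-devices.net") with resolves(bare_hostname("https://iothub-name.azure-devices.net")) would show whether the scheme prefix changes the lookup result there.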

Steps to Reproduce

  1. Update `/etc/iotedge/config.yaml` to match the following, which configures x.509 certificates for DPS provisioning of the IoT Edge device:
###############################################################################
#                      IoT Edge Daemon configuration
###############################################################################
#
# This file configures the IoT Edge daemon. The daemon must be restarted to
# pick up any configuration changes.
#
# Note - this file is yaml. Learn more here: http://yaml.org/refcard.html
#
###############################################################################

###############################################################################
# Provisioning mode and settings
###############################################################################
#
# Configures the identity provisioning mode of the daemon.
#
# Supported modes:
#     manual - using an iothub connection string
#     dps    - using dps for provisioning
#
# DPS Settings
#     scope_id - Required. Value of a specific DPS instance's ID scope
#     registration_id - Required. Registration ID of a specific device in DPS
#     symmetric_key - Optional. This entry should only be specified when
#                     provisioning devices configured for symmetric key
#                     attestation
###############################################################################

# Manual provisioning configuration
# provisioning:
#   source: "manual"
#   device_connection_string: "<ADD DEVICE CONNECTION STRING HERE>"

provisioning:
  source: "dps"
  global_endpoint: "https://global.azure-devices-provisioning.net"
  scope_id: "0ne000XXXXX"
  attestation:
    method: "x509"
    identity_cert: "/directory/certs/iot-edge-device-ca-MyEdgeDeviceCA-full-chain.cert.pem"
    identity_pk: "/directory/private/iot-edge-device-ca-MyEdgeDeviceCA.key.pem"
  dynamic_reprovisioning: false

###############################################################################
# Certificate settings
###############################################################################
#
# Configures the certificates required to operate the IoT Edge
# runtime as a gateway which enables external leaf devices to securely
# communicate with the Edge Hub. If not specified, the required certificates
# are auto generated for quick start scenarios which are not intended for
# production environments.
#
# Settings:
#     device_ca_cert   - path to the device ca certificate and its chain
#     device_ca_pk     - path to the device ca private key file
#     trusted_ca_certs - path to a file containing all the trusted CA
#                        certificates required for Edge module communication
#
###############################################################################

certificates:
  device_ca_cert: "/directory/certs/iot-edge-device-ca-MyEdgeDeviceCA-full-chain.cert.pem"
  device_ca_pk: "/directory/private/iot-edge-device-ca-MyEdgeDeviceCA.key.pem"
  trusted_ca_certs: "/directory/certs/azure-iot-test-only.root.ca.cert.pem"

###############################################################################
# Edge Agent module spec
###############################################################################
#
# Configures the initial Edge Agent module.
#
# The daemon uses this definition to bootstrap the system. The Edge Agent can
# then update itself based on the Edge Agent module definition present in the
# deployment in IoT Hub.
#
# Settings:
#     name     - name of the edge agent module. Expected to be "edgeAgent".
#     type     - type of module. Always "docker".
#     env      - Any environment variable that needs to be set for edge agent module.
#     config   - type specific configuration for edge agent module.
#       image  - (docker) Modules require a docker image tag.
#       auth   - (docker) Modules may need authorization to connect to container registry.
#
# Adding environment variables:
# replace "env: {}" with
#  env:
#    key: "value"
#
# Adding container registry authorization:
# replace "auth: {}" with
#    auth:
#      username: "username"
#      password: "password"
#      serveraddress: "serveraddress"
#
###############################################################################

agent:
  name: "edgeAgent"
  type: "docker"
  env: {}
  config:
    image: "mcr.microsoft.com/azureiotedge-agent:1.0"
    auth: {}

###############################################################################
# Edge device hostname
###############################################################################
#
# Configures the environment variable 'IOTEDGE_GATEWAYHOSTNAME' injected into
# modules. Regardless of case the hostname is specified below, a lowercase
# value is used to configure the Edge Hub server hostname as well as the
# environment variable specified above.
#
# It is important to note that when connecting downstream devices to the
# Edge Hub that the lower case value of this hostname be used in the
# 'GatewayHostName' field of the device's connection string URI.
###############################################################################

hostname: "AzureIoTEdgeGatewayLinuxVM"

###############################################################################
# Watchdog settings
###############################################################################
#
# The IoT edge daemon has a watchdog that periodically checks the health of the
# Edge Agent module and restarts it if it's down.
#
# max_retries - Configures the number of retry attempts that the IoT edge daemon
#               should make for failed operations before failing with a fatal error.
#
#               If this configuration is not specified, the daemon keeps retrying
#               on errors and doesn't fail fatally.
#
#               On a fatal failure, the daemon returns an exit code which
#               signifies the kind of error encountered. Currently, the following
#               error codes are returned by the daemon -
#
#               150 - Invalid Device ID specified.
#               151 - Invalid IoT hub configuration.
#               152 - Invalid SAS token used to call IoT hub.
#                     This could signal an invalid SAS key.
#               1 - All other errors.
###############################################################################

#watchdog:
#  max_retries: 2

###############################################################################
# Connect settings
###############################################################################
#
#
#Configures URIs used by clients of the management and workload APIs
#     management_uri - used by the Edge Agent and 'iotedge' CLI to start,
#                      stop, and manage modules
#     workload_uri   - used by modules to retrieve tokens and certificates
#
# The following uri schemes are supported:
#     http - connect over TCP
#     unix - connect over Unix domain socket
#
###############################################################################

connect:
  management_uri: "unix:///var/run/iotedge/mgmt.sock"
  workload_uri: "unix:///var/run/iotedge/workload.sock"

###############################################################################
# Listen settings
###############################################################################
#
# Configures the listen addresses for the daemon.
#     management_uri - used by the Edge Agent and 'iotedge' CLI to start,
#                      stop, and manage modules
#     workload_uri   - used by modules to retrieve tokens and certificates
#
# The following uri schemes are supported:
#     http - listen over TCP
#     unix - listen over Unix domain socket
#     fd   - listen using systemd socket activation
#
# These values can be different from the connect URIs. For instance, when
# using the fd:// scheme for systemd:
#     listen address is fd://iotedge.workload,
#     connect address is unix:///var/run/iotedge/workload.sock
#
###############################################################################

listen:
  management_uri: "fd://iotedge.mgmt.socket"
  workload_uri: "fd://iotedge.socket"

###############################################################################
# Home Directory
###############################################################################
#
# Configures the home directory for the daemon.
#
###############################################################################

homedir: "/var/lib/iotedge"

###############################################################################
# Moby Container Runtime settings
###############################################################################
#
# uri - configures the uri for the container runtime.
# network - configures the network on which the containers will be created.
#
###############################################################################

moby_runtime:
  uri: "unix:///var/run/docker.sock"
#   network: "azure-iot-edge"
  2. Run the following command:
sudo iotedge check --verbose --iothub-hostname https://iothub-name.azure-devices.net
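
The comments in the config above list which URI schemes the connect and listen sections accept: http and unix for connect, and http, unix and fd for listen. A minimal Python sketch of checking a URI against those lists follows, assuming nothing beyond what the comments state. SUPPORTED_SCHEMES and scheme_is_supported are illustrative names, not part of iotedge.

```python
from urllib.parse import urlsplit

# Schemes taken from the "Connect settings" and "Listen settings" comments.
SUPPORTED_SCHEMES = {
    "connect": {"http", "unix"},
    "listen": {"http", "unix", "fd"},
}


def scheme_is_supported(section: str, uri: str) -> bool:
    """Check whether `uri` uses a scheme listed for the given config section."""
    return urlsplit(uri).scheme in SUPPORTED_SCHEMES[section]
```

With the values from this config, the unix:// connect URIs and the fd:// listen URIs both fall within the listed schemes, so this part of the file does not look like the source of the errors.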

Context (Environment)

Output of iotedge check

```
Configuration checks
--------------------
√ config.yaml is well-formed - OK
× config.yaml has well-formed connection string - Error
    Device is not using manual provisioning, so Azure IoT Hub hostname needs to be specified with --iothub-hostname
√ container engine is installed and functional - OK
‼ config.yaml has correct hostname - Warning
    config.yaml has hostname AzureIoTEdgeGatewayLinuxVM which does not comply with RFC 1035.
    - Hostname must be between 1 and 255 octets inclusive.
    - Each label in the hostname (component separated by ".") must be between 1 and 63 octets inclusive.
    - Each label must start with an ASCII alphabet character (a-z), end with an ASCII alphanumeric character (a-z, 0-9), and must contain only ASCII alphanumeric characters or hyphens (a-z, 0-9, "-").
    Not complying with RFC 1035 may cause errors during the TLS handshake with modules and downstream devices.
× config.yaml has correct URIs for daemon mgmt endpoint - Error
    Error: could not execute list-modules request: an error occurred trying to connect: Connection refused (os error 111)
        caused by: docker returned exit code: 1, stderr = Error: could not execute list-modules request: an error occurred trying to connect: Connection refused (os error 111)
√ latest security daemon - OK
√ host time is close to real time - OK
√ container time is close to host time - OK
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
        caused by: Could not open container engine config file /etc/docker/daemon.json
        caused by: No such file or directory (os error 2)
√ production readiness: certificates - OK
√ production readiness: certificates expiry - OK
√ production readiness: container engine - OK
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
        caused by: Could not open container engine config file /etc/docker/daemon.json
        caused by: No such file or directory (os error 2)

Connectivity checks
-------------------
√ host can connect to and perform TLS handshake with DPS endpoint - OK
‼ host can connect to and perform TLS handshake with IoT Hub AMQP port - Warning
    skipping because of previous failures
‼ host can connect to and perform TLS handshake with IoT Hub HTTPS / WebSockets port - Warning
    skipping because of previous failures
‼ host can connect to and perform TLS handshake with IoT Hub MQTT port - Warning
    skipping because of previous failures
‼ container on the default network can connect to IoT Hub AMQP port - Warning
    skipping because of previous failures
‼ container on the default network can connect to IoT Hub HTTPS / WebSockets port - Warning
    skipping because of previous failures
‼ container on the default network can connect to IoT Hub MQTT port - Warning
    skipping because of previous failures
‼ container on the IoT Edge module network can connect to IoT Hub AMQP port - Warning
    skipping because of previous failures
‼ container on the IoT Edge module network can connect to IoT Hub HTTPS / WebSockets port - Warning
    skipping because of previous failures
‼ container on the IoT Edge module network can connect to IoT Hub MQTT port - Warning
    skipping because of previous failures
× Edge Hub can bind to ports on host - Error
    Could not check current state of Edge Hub container
        caused by: docker returned exit code: 1, stderr = Error: No such object: edgeHub

9 check(s) succeeded.
3 check(s) raised warnings.
3 check(s) raised errors.
9 check(s) were skipped due to errors from other checks.
```
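
The hostname warning in the output above spells out three rules. A minimal Python sketch of those rules as written follows. Reading "(a-z)" literally as lowercase-only is an assumption, and the real check inside iotedge may be implemented differently. The name complies_with_listed_rules is hypothetical.

```python
import re

# One label: starts with a-z, ends with a-z or 0-9, contains only
# a-z, 0-9 or "-" in between, and is at most 63 characters long.
_LABEL = re.compile(r"[a-z]([a-z0-9-]{0,61}[a-z0-9])?")


def complies_with_listed_rules(hostname: str) -> bool:
    """Apply the three rules quoted in the iotedge check warning."""
    # Whole hostname must be between 1 and 255 octets inclusive.
    if not 1 <= len(hostname.encode("utf-8")) <= 255:
        return False
    # Every dot-separated label must match the label pattern.
    return all(_LABEL.fullmatch(label) for label in hostname.split("."))
```

Under that lowercase-only reading, AzureIoTEdgeGatewayLinuxVM fails solely because of its uppercase letters, while the all-lowercase form azureiotedgegatewaylinuxvm passes.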

Device Information

The host OS I'm using for this IoT Edge Device is the Azure IoT Edge on Ubuntu VM image from the Azure Marketplace.

Runtime Versions

Note: when using Windows containers on Windows, run docker -H npipe:////./pipe/iotedge_moby_engine version instead

Additional Information

N/A

crpietschmann commented 4 years ago

It looks like this functionality is planned for the v1.0.9 release, per this document: https://github.com/Azure/iotedge/blob/master/doc/rc/how-to-auto-provision-x509-certificates.md#create-and-provision-an-iot-edge-device-using-x509-certificates-preview

Guess I'll close this issue and wait for the 1.0.9 release.