Azure / azure-iot-cli-extension

Azure IoT extension for Azure CLI
Other
83 stars 65 forks source link

[question/bug] `az iot hub monitor-events` stops directly after startup #706

Closed nilskosek closed 1 week ago

nilskosek commented 3 months ago

Describe the bug When running the az iot hub monitor-events -n <my-iothub-name> --device-id <my-devide-id> --content-type application/json --timeout 10 --properties app it will stop directly after execution without any Error. While running the same command with --debug flag, I can see only one error: cli.azext_iot.monitor.telemetry: No Event Hub partitions found to listen on. or. i this the only hint I can give for now. I already found a workaround for it. Now I will not use anymore the -n flag, instead I use --login 'HostName=XXXX.azure-devices.net;SharedAccessKeyName=XXXX;SharedAccessKey=XXXX'.

So maybe you could help me, why the first command isn't working anymore. Any help is appreciated.

To Reproduce Steps to reproduce the behavior:

  1. Open Bash
  2. az login (to login into your azure account)
  3. az iot hub monitor-events -n <my-iothub-name> --device-id <my-devide-id> --content-type application/json --timeout 10 --properties app
  4. Monitoring will start but stops directly after

Expected behavior IotHub monitoring (for a given device-id) is started and will provide all incoming messages. Are there no messages within 10 seconds (after a received message) the monitoring will stop.

Screenshots / `` Environment:

Additional context /

c-ryan-k commented 3 months ago

Thanks @nilskosek for bringing this to our attention. I did notice a single similar failure this morning from our nightly pipelines, I'm running another few tests to determine if this is just a service interruption or if it's a bug we need to patch.

Thank you also for providing a workaround - in the meantime would you mind trying again with -n to see if you can still reproduce the issue?

nilskosek commented 3 months ago

Hey @c-ryan-k, yes I can still reproduce the issue.

stefanb2 commented 2 months ago

I can confirm this bug with that same error message.

@c-ryan-k this is a critical problem, because I have to use this command several times a day for my development work.

For me az iot hub monitor-events broke permanently with that error message for my development Azure IoT Hub, while I was developing a Python script that uses Azure SDK for Python's azure-eventhub package to capture events from that hub. It still works fine for the other two Azure IoT Hubs I have access to.

Of course the problem might actually be the Azure Event Hub compatible endpoint on the Azure IoT Hub side. But the command doesn't print out any details what's actually wrong with the endpoint.

@nilskosek thanks for pointing out the workaround. I can confirm that with --login instead of --subscription the command works again.

vilit1 commented 2 months ago

Hi,

I will be helping out @c-ryan-k. We unfortunately cannot reproduce your error still and would like some more information:

  1. Can you provide us the output of az iot hub show (please remove hub name, resource group, subscription, and any other fields that would have personal information).

  2. What Shared Access Policies do you have?

Thanks!

stefanb2 commented 2 months ago

Output from az iot hub show:

{
  "etag": "CgAAACAAAAA=",
  "id": "<REDACTED>",
  "identity": {
    "principalId": "<REDACTED>",
    "tenantId": "<REDACTED>",
    "type": "SystemAssigned",
    "userAssignedIdentities": null
  },
  "location": "northeurope",
  "name": "<REDACTED>",
  "properties": {
    "allowedFqdnList": [],
    "authorizationPolicies": null,
    "cloudToDevice": {
      "defaultTtlAsIso8601": "1:00:00",
      "feedback": {
        "lockDurationAsIso8601": "0:01:00",
        "maxDeliveryCount": 10,
        "ttlAsIso8601": "1:00:00"
      },
      "maxDeliveryCount": 10
    },
    "comments": null,
    "deviceStreams": null,
    "disableDeviceSas": null,
    "disableLocalAuth": null,
    "disableModuleSas": null,
    "enableDataResidency": null,
    "enableFileUploadNotifications": false,
    "encryption": null,
    "eventHubEndpoints": {
      "events": {
        "endpoint": "sb://iothub-ns-<REDACTED>-iot-<REDACTED>.servicebus.windows.net/",
        "partitionCount": 4,
        "partitionIds": [],
        "path": "<REDACTED>",
        "retentionTimeInDays": 1
      }
    },
    "features": "RootCertificateV2",
    "hostName": "<REDACTED>.azure-devices.net",
    "ipFilterRules": [],
    "ipVersion": null,
    "locations": [
      {
        "location": "North Europe",
        "role": "primary"
      },
      {
        "location": "West Europe",
        "role": "secondary"
      }
    ],
    "messagingEndpoints": {
      "fileNotifications": {
        "lockDurationAsIso8601": "0:01:00",
        "maxDeliveryCount": 10,
        "ttlAsIso8601": "1:00:00"
      }
    },
    "minTlsVersion": null,
    "networkRuleSets": null,
    "privateEndpointConnections": null,
    "provisioningState": "Succeeded",
    "publicNetworkAccess": null,
    "restrictOutboundNetworkAccess": null,
    "rootCertificate": {
      "enableRootCertificateV2": true,
      "lastUpdatedTimeUtc": "2023-05-30T19:03:54.730159+00:00",
      "microsoftMigratedRootCert": false
    },
    "routing": {
      "endpoints": {
        "cosmosDbSqlContainers": [],
        "eventHubs": [],
        "serviceBusQueues": [
          {
            "authenticationType": null,
            "connectionString": "<REDACTED>",
            "endpointUri": null,
            "entityPath": null,
            "id": "<REDACTED>",
            "identity": null,
            "name": "status-sb-queue",
            "resourceGroup": "<REDACTED>",
            "subscriptionId": "<REDACTED>"
          }
        ],
        "serviceBusTopics": [
          {
            "authenticationType": null,
            "connectionString": "<REDACTED>",
            "endpointUri": null,
            "entityPath": null,
            "id": "<REDACTED>",
            "identity": null,
            "name": "measuremet-topic",
            "resourceGroup": "<REDACTED>",
            "subscriptionId": "<REDACTED>"
          }
        ],
        "storageContainers": []
      },
      "enrichments": null,
      "fallbackRoute": {
        "condition": "true",
        "endpointNames": [
          "events"
        ],
        "isEnabled": true,
        "name": "$fallback",
        "source": "DeviceMessages"
      },
      "routes": [
        {
          "condition": "true\n",
          "endpointNames": [
            "measuremet-topic"
          ],
          "isEnabled": true,
          "name": "measurement-to-sb-topic",
          "source": "DeviceMessages"
        },
        {
          "condition": "true",
          "endpointNames": [
            "events"
          ],
          "isEnabled": true,
          "name": "To_default_endpoint",
          "source": "DeviceMessages"
        }
      ]
    },
    "state": "Active",
    "storageEndpoints": {
      "$default": {
        "authenticationType": "keyBased",
        "connectionString": "",
        "containerName": "",
        "identity": null,
        "sasTtlAsIso8601": "1:00:00"
      }
    }
  },
  "resourcegroup": "<REDACTED>",
  "sku": {
    "capacity": 1,
    "name": "S1",
    "tier": "Standard"
  },
  "subscriptionid": "<REDACTED>",
  "systemData": {
    "createdAt": "2021-01-18T08:35:56.487000+00:00",
    "createdBy": null,
    "createdByType": null,
    "lastModifiedAt": null,
    "lastModifiedBy": null,
    "lastModifiedByType": null
  },
  "tags": {},
  "type": "Microsoft.Devices/IotHubs"
}

Output from az iot hub policy list:

[
  {
    "keyName": "iothubowner",
    "primaryKey": "<REDACTED>",
    "rights": "RegistryWrite, ServiceConnect, DeviceConnect",
    "secondaryKey": "<REDACTED>"
  },
  {
    "keyName": "service",
    "primaryKey": "<REDACTED>",
    "rights": "RegistryWrite, ServiceConnect",
    "secondaryKey": "<REDACTED>"
  },
  {
    "keyName": "device",
    "primaryKey": "<REDACTED>",
    "rights": "DeviceConnect",
    "secondaryKey": "<REDACTED>"
  },
  {
    "keyName": "registryRead",
    "primaryKey": "<REDACTED>",
    "rights": "RegistryRead",
    "secondaryKey": "<REDACTED>"
  },
  {
    "keyName": "registryReadWrite",
    "primaryKey": "<REDACTED>",
    "rights": "RegistryWrite, ServiceConnect",
    "secondaryKey": "<REDACTED>"
  },
  {
    "keyName": "backend_access",
    "primaryKey": "<REDACTED>",
    "rights": "RegistryWrite, ServiceConnect",
    "secondaryKey": "<REDACTED>"
  },
  {
    "keyName": "antenna-tests",
    "primaryKey": "<REDACTED>",
    "rights": "RegistryRead, ServiceConnect",
    "secondaryKey": "<REDACTED>"
  }
]

The command az iot hub monitor-events chooses the policy iothubowner:

cli.azure.cli.core.sdk.policies: Response content:
cli.azure.cli.core.sdk.policies: {"value":[{"keyName":"iothubowner", ... <REDACTED> ...}]}
cli.azext_iot.common.base_discovery: Using policy 'iothubowner' for IoT Hub interaction.
Starting event monitor, use ctrl-c to stop...
cli.azext_iot.monitor.telemetry: No Event Hub partitions found to listen on.
vilit1 commented 2 months ago

I see that there is a partition count but no partition id's which is weird: image

If you provide login, we do not fetch the hub information (as you see from az iot hub show) and instead "guess" what partition id's are available (based on the partition count). This fills in this empty list and allows the command to succeed. I will add in some fail-safes to ensure that we are not fetching an empty partition id list (and throw a useful error if there are no partitions) and double check if this is expected behavior on the hub side.

stefanb2 commented 2 months ago

Strangely enough azure-eventhub doesn't seem to have a problem to detect the partitions on that same hub. I.e. the Python code I wrote to capture events from the IoT Hub continues to work fine.

Excerpt from my test code

import logging
from azure.eventhub.aio   import EventHubConsumerClient
...
        try:
            self._client = EventHubConsumerClient.from_connection_string(
                conn_str=event_hub_connection_string,
                consumer_group="$default",
                uamqp_transport=True,
            )
        except Exception as e:
            fatal('Event Hub consumer client creation failed', e)
...
        self._logger    = logging.getLogger(__class__.__name__)
...
        for id in await self._client.get_partition_ids():
            self._logger.info(id)
            part = await self._client.get_partition_properties(id)
            self._logger.info(part)
        await self._client.close()
INFO:xxx:0
INFO:xxx:{'eventhub_name': '<REDACTED>, 'id': '0', 'beginning_sequence_number': 2052919, 'last_enqueued_sequence_number': 2062152, 'last_enqueued_offset': '3234134055352', 'is_empty': False, 'last_enqueued_time_utc': datetime.datetime(2024, 7, 11, 5, 44, 57, 249000, tzinfo=datetime.timezone.utc)}
INFO:xxx:1
INFO:xxx:{'eventhub_name': '<REDACTED>', 'id': '1', 'beginning_sequence_number': 1038378, 'last_enqueued_sequence_number': 1043256, 'last_enqueued_offset': '2147514888752', 'is_empty': False, 'last_enqueued_time_utc': datetime.datetime(2024, 7, 11, 5, 45, 4, 193000, tzinfo=datetime.timezone.utc)}
INFO:xxx:2
INFO:xxx:{'eventhub_name': '<REDACTED>', 'id': '2', 'beginning_sequence_number': 1536098, 'last_enqueued_sequence_number': 1540890, 'last_enqueued_offset': '2757397051176', 'is_empty': False, 'last_enqueued_time_utc': datetime.datetime(2024, 7, 11, 5, 44, 55, 698000, tzinfo=datetime.timezone.utc)}
INFO:xxx:3
INFO:xxx:{'eventhub_name': '<REDACTED>', 'id': '3', 'beginning_sequence_number': 754285, 'last_enqueued_sequence_number': 758972, 'last_enqueued_offset': '1713722268456', 'is_empty': False, 'last_enqueued_time_utc': datetime.datetime(2024, 7, 11, 5, 44, 49, 196000, tzinfo=datetime.timezone.utc)}
nilskosek commented 1 week ago

Thanks @vilit1 for providing this fix . I just saw that you're released Version 0.25.0 of azure-iot-cli-extension last week (that included this fix). I've tested it on my machine and now it's working as expected again. From my perspective we could close this Issue, if nobody else have any further points about it? If so, should I or you close the issue?

stefanb2 commented 1 week ago

I can confirm that monitor-events now works for me again on the "broken" Azure IoT Hub. Unfortunately I forgot to check before upgrading the extension, so I can't be sure if it was fixed by the upgrade or something else.

vilit1 commented 1 week ago

I will close the issue - if this error shows up again, feel free to open a new issue.