Azure / iot-identity-service

Source of the Azure IoT Identity Service and related services.
MIT License
37 stars 45 forks source link

Using idtype module seems to recreate the module identity in the cloud #600

Closed FinitelyFailed closed 3 months ago

FinitelyFailed commented 3 months ago

Hi,

We are using azure identity services version 1.4.7:

# aziotctl --version
aziotctl 1.4.7

And azure device update 1.0.2:

# /usr/bin/AducIotAgent -v
1.0.2

We are trying to setup ADU to use a module identity. As we have our own service which also connection to the IoT Hub using AIS. If borth ADU and our own service are setup to be run as "device" they fight over which should use the IoT Hub device. So we have chosen to use ADU as a module identity.

So I have configured identity service like this:

# cat /etc/aziot/identityd/config.d/adu.toml      
[[principal]]
uid = 800
name = "DeviceUpdate"
idtype = ["module"]

Which seems to work.

We are using ADU to perform an A-B update of the device. We have these partitions:

We are running our OS on the rootA partition. And saving our config for AIS on a data partition, so that the config is not lost during the update. The config contains the connection string for the device on the IoT hub. The config which is used to setup ADU to use an Module Identity is located in "/etc/aziot/identityd/config.d/adu.toml" in our OS (i.e. on both rootA and later rootB).

When ADU performs the update (with the help of swupdate) a new version of our OS is installed on rootB. And a flag is set on the boot partition, so that when a reboot is performed rootB is booted instead of rootA. Than it reboots.

After a reboot the command "aziotctl config apply -c /data/aziot/config.toml" is performed, so that AIS can connect to the device in the IoT Hub.

Now to the problem after that command has been performed the Module Identity for our device in the IoT hub seems to be recreated. As nothing we put on the Module Identities twin remains.

We need to be able to set tags on the Modules Identity twin and make them stick. One of the tags is which ADUGroup this device belongs to.

According to this documentation: "If the idtype contains module, the name will be used by the Identity Service to generate new cloud identities, if not created already.". It should work as we have a module identity with the same name, but after an update is performed the twin is missing our tags.

gordonwang0 commented 3 months ago

Could you provide logs from the startup of aziot-identityd after the update? Those should show which modules identities are being modified.

Also, are you persisting /var/lib/aziot across updates? If not, aziot-identity-service will have an empty state after the update, which may cause it to think that certain module identities don't exist.

FinitelyFailed commented 3 months ago

Could you provide logs from the startup of aziot-identityd after the update? Those should show which modules identities are being modified.

The log from aziot-identityd is this: Mar 20 10:19:41 raspberrypi4-64 systemd[1]: Started Azure IoT Identity Service.

Is there a way to set it to verbose logging?

Also, are you persisting /var/lib/aziot across updates? If not, aziot-identity-service will have an empty state after the update, which may cause it to think that certain module identities don't exist.

We do not persist /var/lib/aziot across updates.

If I do:

aziotctl system stop
systemctl stop deviceupdate-agent
rm /etc/aziot/identityd/config.d/00-super.toml
rm -rf /var/lib/aziot/identityd/*
rm -rf /var/lib/aziot/keyd/*

Add something to the Module Identity twin. And run:

aziotctl system restart
systemctl start deviceupdate-agent
aziotctl config apply

The Module Identity is recreated and all the things I added is gone.

This is the log from deviceupdate-agent:

Mar 20 10:19:46 raspberrypi4-64 systemd[1]: Started Device Update Agent daemon..
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.1635Z 12176[12176] [D] Log file created: /data/adu/logs/du-agent.20240320-101946.log [zlog_init]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.1646Z 12176[12176] [I] Agent (linux; 1.0.2) starting. [main]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.1646Z 12176[12176] [I] Git Info: main:2d1f0367 [main]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.1646Z 12176[12176] [I] Supported Update Manifest version: min: 4, max: 5 [main]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.3799Z 12176[12176] [I] Health check passed. [HealthCheck]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.5995Z 12176[12176] [I] Initializing PnP components. [ADUC_PnP_Components_Create]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.5996Z 12176[12176] [I] ADUC agent started. Using IoT Hub Client SDK 1.9.0 [AzureDeviceUpdateCoreInterface_Create]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.5996Z 12176[12176] [I] Calling ADUC_RegisterPlatformLayer [ADUC_MethodCall_Register]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.5997Z 12176[12176] [I] Loading extension 'Content Downloader'. Reg file : /var/lib/adu/extensions/content_downloader/extension.json [LoadExtensionLibrary]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.6022Z 12176[12176] [D] Determining contract version for content downloader. [LoadContentDownloaderLibrary]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.6022Z 12176[12176] [D] Got Contract 1.0 for content downloader [LoadContentDownloaderLibrary]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.6028Z 12176[12176] [I] Initializing command listener thread [InitializeCommandListenerThread]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.6030Z 12176[12176] [I] Command register at slot#0 [RegisterCommand]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.6030Z 12176[12176] [I] Agent running. [main]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.6030Z 12176[12177] [I] Command FIFO file created successfully. [TryCreateFIFOPipe]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.8667Z 12176[12176] [I] Attempting to create connection to IotHub using type: ADUC_ConnType_Module  [ADUC_DeviceClient_Create]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.8668Z 12176[12176] [I] IotHub Protocol: MQTT [GetIotHubProtocolFromConfig]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.8669Z 12176[12176] [I] IoTHub Device Twin callback registered. [ADUC_DeviceClient_Create]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.8669Z 12176[12176] [I] Refreshing the handle for the PnP channels. [ADUC_PnP_Components_HandleRefresh]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:46.8669Z 12176[12176] [I] Successfully re-authenticated the IoT Hub connection. [ADUC_Refresh_IotHub_Connection_SAS_Token]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.0665Z 12176[12176] [I] The connection is currently broken. Will try to authenticate in 17 seconds. [Connection_Maintenance]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: -> 10:19:47 CONNECT | VER: 4 | KEEPALIVE: 240 | FLAGS: 192 | USERNAME: syscomiotedgehubtest.azure-devices.net/tmm-2024-03-20/DeviceUpdate/?api-version=2020-09-30&DeviceClientType=iothubclient%2f1.9.0%20(native%3b%20Linux%3b%20aarch64)&model-id=dtmi%3aazure%3aiot%3adeviceUpdateModel%3b2 | PWD: XXXX | CLEAN: 0
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: <- 10:19:47 CONNACK | SESSION_PRESENT: false | RETURN_CODE: 0x0
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.2737Z 12176[12176] [D] IotHub connection status: 0, reason: 6 [IoTHub_CommunicationManager_ConnectionStatus_Callback]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: -> 10:19:47 SUBSCRIBE | PACKET_ID: 2 | TOPIC_NAME: $iothub/twin/res/# | QOS: 0
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: <- 10:19:47 SUBACK | PACKET_ID: 2 | RETURN_CODE: 0
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: -> 10:19:47 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_MOST_ONCE | TOPIC_NAME: $iothub/twin/GET/?$rid=3
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: <- 10:19:47 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_MOST_ONCE | TOPIC_NAME: $iothub/twin/res/200/?$rid=3 | PAYLOAD_LEN: 52
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6759Z 12176[12176] [I] Processing existing Device Twin data after agent started. [ADUC_PnPDeviceTwin_Callback]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6759Z 12176[12176] [D] Notifies components that all callback are subscribed. [ADUC_PnPDeviceTwin_Callback]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6760Z 12176[12176] [I] Perform startup tasks. [ADUC_Workflow_HandleStartupWorkflowData]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6760Z 12176[12176] [I] There's no update actions in current workflow (first time connected to IoT Hub). [ADUC_Workflow_HandleStartupWorkflowData]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6760Z 12176[12176] [D] cancellationType(1) => Normal [ADUC_Workflow_HandleUpdateAction]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6760Z 12176[12176] [W] Setting 'OperationCancelled' when there's no active workflow. (value:0) [workflow_set_operation_cancel_requested]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6760Z 12176[12176] [I] Cancel received with no operation in progress - returning to Idle state [ADUC_Workflow_HandleUpdateAction]
Mar 20 10:19:47 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6971Z 12176[12176] [D] Queueing message (t:5, c:0x7c65c9d0, m:{"deviceUpdate":{"__t":"c","agent":{"deviceProperties":{"manufacturer":"Gaia","model":"raspberrypi4-64","env":"otn","interfaceId":null,"contractModelId":"dtmi:azure:iot:deviceUpdateContractModel;2","aduVer":"DU;agent\/1.0.2","doVer":"DU;lib\/v1.0.0+20221028.173411.b61de2d,DU;agent\/v1.0.0+20221028.173411.b61de2d"},"compatPropertyNames":"manufacturer,model"}}}) [ADUC_D2C_Message_SendAsync]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6973Z 12176[12176] [I] DeviceInformation component is ready - reporting properties [DeviceInfoInterface_Connected]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6974Z 12176[12176] [I] Property manufacturer changed to Gaia [RefreshDeviceInfoInterfaceData]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6975Z 12176[12176] [I] Property model changed to raspberrypi4-64 [RefreshDeviceInfoInterfaceData]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6976Z 12176[12176] [I] Property osName changed to Poky (Yocto Project Reference Distro) [RefreshDeviceInfoInterfaceData]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6977Z 12176[12176] [I] Property swVersion changed to 3.1.32 (dunfell) [RefreshDeviceInfoInterfaceData]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.6977Z 12176[12176] [I] Property processorArchitecture changed to aarch64 [RefreshDeviceInfoInterfaceData]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.7044Z 12176[12176] [I] Property processorManufacturer changed to ARM [RefreshDeviceInfoInterfaceData]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.7044Z 12176[12176] [I] Property totalMemory changed to 1895724 [RefreshDeviceInfoInterfaceData]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.7045Z 12176[12176] [I] Property totalStorage changed to 1848532 [RefreshDeviceInfoInterfaceData]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.7045Z 12176[12176] [D] Queueing message (t:2, c:0x7c662de0, m:{"deviceInformation":{"__t":"c","manufacturer":"Gaia","model":"raspberrypi4-64","osName":"Poky (Yocto Project Reference Distro)","swVersion":"3.1.32 (dunfell)","processorArchitecture":"aarch64","processorManufacturer":"ARM","totalMemory":1895724,"totalStorage":1848532}}) [ADUC_D2C_Message_SendAsync]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.7045Z 12176[12176] [I] DiagnosticsInterface is connected [DiagnosticsInterface_Connected]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.7046Z 12176[12176] [D] Sending D2C message (t:2, retries:0). [ProcessMessage]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.7046Z 12176[12176] [D] Sending D2C message:
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: {"deviceInformation":{"__t":"c","manufacturer":"Gaia","model":"raspberrypi4-64","osName":"Poky (Yocto Project Reference Distro)","swVersion":"3.1.32 (dunfell)","processorArchitecture":"aarch64","processorManufacturer":"ARM","totalMemory":1895724,"totalStorage":1848532}} [ADUC_D2C_Default_Message_Transport_Function]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.7046Z 12176[12176] [D] Sending D2C message (t:5, retries:0). [ProcessMessage]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:47.7046Z 12176[12176] [D] Sending D2C message:
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: {"deviceUpdate":{"__t":"c","agent":{"deviceProperties":{"manufacturer":"Gaia","model":"raspberrypi4-64","env":"otn","interfaceId":null,"contractModelId":"dtmi:azure:iot:deviceUpdateContractModel;2","aduVer":"DU;agent\/1.0.2","doVer":"DU;lib\/v1.0.0+20221028.173411.b61de2d,DU;agent\/v1.0.0+20221028.173411.b61de2d"},"compatPropertyNames":"manufacturer,model"}}} [ADUC_D2C_Default_Message_Transport_Function]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: -> 10:19:47 SUBSCRIBE | PACKET_ID: 4 | TOPIC_NAME: $iothub/twin/PATCH/properties/desired/# | QOS: 0
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: <- 10:19:47 SUBACK | PACKET_ID: 4 | RETURN_CODE: 0
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: -> 10:19:48 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_MOST_ONCE | TOPIC_NAME: $iothub/twin/PATCH/properties/reported/?$rid=5 | PAYLOAD_LEN: 270
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: -> 10:19:48 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_MOST_ONCE | TOPIC_NAME: $iothub/twin/PATCH/properties/reported/?$rid=6 | PAYLOAD_LEN: 361
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: <- 10:19:48 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_MOST_ONCE | TOPIC_NAME: $iothub/twin/res/204/?$rid=5&$version=2 | PAYLOAD_LEN: 0
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:48.3073Z 12176[12176] [D] context:0x76739cc0 [DefaultIoTHubSendReportedStateCompletedCallback]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:48.3073Z 12176[12176] [D] D2C message processed successfully (t:2, r:0, content:0x7c6517c0 ) [DefaultIoTHubSendReportedStateCompletedCallback]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:48.3073Z 12176[12176] [D] Send message completed (status:3) [OnDeviceInfoD2CMessageCompleted]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: <- 10:19:48 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_MOST_ONCE | TOPIC_NAME: $iothub/twin/res/204/?$rid=6&$version=3 | PAYLOAD_LEN: 0
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:48.3075Z 12176[12176] [D] context:0x76739eb8 [DefaultIoTHubSendReportedStateCompletedCallback]
Mar 20 10:19:48 raspberrypi4-64 AducIotAgent[12176]: 2024-03-20T10:19:48.3075Z 12176[12176] [D] D2C message processed successfully (t:5, r:0, content:0x7c65cb50 ) [DefaultIoTHubSendReportedStateCompletedCallback]

What we probably want to do is to create the device in IoT Hub, and a Module Identity. Where the Module Identity has a known name and an ADUGroup on its tags. So that the deviceupdate-agent on the device, can use that Module Identity and the ADUGroup specified on the Module Identitys twin.

gordonwang0 commented 3 months ago

You can use aziotctl system set-log-level to change the log level. I would expect more logs from identityd than just the "Started Azure IoT Identity Service" line.

You probably need to preserve /var/lib/aziot across updates so that aziot-identity-service maintains its state. That should prevent it from recreating the module identities when it restarts.

FinitelyFailed commented 3 months ago

You can use aziotctl system set-log-level to change the log level. I would expect more logs from identityd than just the "Started Azure IoT Identity Service" line.

Ok, I'll try that when I'm back at the office tomorrow. And get back to you.

You probably need to preserve /var/lib/aziot across updates so that aziot-identity-service maintains its state. That should prevent it from recreating the module identities when it restarts.

What we probably want to do is to create the device in IoT Hub, and a Module Identity. Where the Module Identity has a known name and an ADUGroup on its tags. So that the deviceupdate-agent on the device, can use that Module Identity and the ADUGroup specified on the Module Identitys twin. But that is not possible, you think?

gordonwang0 commented 3 months ago

What we probably want to do is to create the device in IoT Hub, and a Module Identity. Where the Module Identity has a known name and an ADUGroup on its tags. So that the deviceupdate-agent on the device, can use that Module Identity and the ADUGroup specified on the Module Identitys twin. But that is not possible, you think?

I'm not sure, as I primarily work with iotedge/iot-identity service and not ADU. You would have to try it out.

FinitelyFailed commented 3 months ago

But isn't this about iot-identity? As far as I understand deviceupdate-agent uses AIS to speak to the IoT hub. And it is a configuration of AIS that says that deviceupdate-agent shall be a module?

This is the AIS config for deviceupdate-agent (ADU):

# cat /etc/aziot/identityd/config.d/adu.toml 
[[principal]]
uid = 800
name = "DeviceUpdate"
idtype = ["module"]

I changed the log level to trace via aziotctl. And included the log as a file, as well as the log for deviceupdate-agent. deviceupdate-agent-logs.txt aziot-identity-log.txt

I can see that aziot-identity recreates the Module Identity, but I can't see a reason why.

gordonwang0 commented 3 months ago

From the Identity Service logs, there's a line:

Hub identity "DeviceUpdate" will be recreated

This happens when the identity is specified in configuration (as you have specified in adu.toml), but the identity isn't present in identityd's stored state. Identityd recreates the module to set the module identity key, as it cannot derive the key of a module not present in its stored state.

Therefore, it looks like you will need to persist everything under /var/lib/aziot across updates to prevent the module from being recreated.

FinitelyFailed commented 3 months ago

Thanks!

You should probably update the documentation. And point out that module identity may be recreated, if the current running instance of aziot-identityd didn't create the module identity. And that one should be aware that the module identity can be created and recreated. Because in our case we will probably need to create some kind of hook, trigger or similar to catch that the module identity is create. So that we can write tags and so on to its twin. We had hoped for that we only needed to create the module identity once in the cloud and at the same time write properties to it, but that will not work.

idtype (optional) is the identity type returned to that host process associated with the IoT Hub identity type being returned to that host process - it is an array containing module, device, or local. If idtype is not specified, the caller will be authorized to access all cloud identity resources (except those created by Identity Service for host processes). If the idtype contains module, the name will be used by the Identity Service to generate new cloud identities, if not created already. idtype with device is only used in special cases, when the host process needs to retrieve the provisioned device identity, instead of module identity.