Azure / iot-hub-device-update

Device Update for IoT Hub agent
MIT License
51 stars 36 forks source link

AducIotAgent cannot find custom step handler #626

Open MoritzSKB opened 1 month ago

MoritzSKB commented 1 month ago

Expected Behaviour

The custom handler should be found and run normally

Current Behaviour

When running a device update using the custom agent, it almost immediately flips to the failed state. Checking the logs, this is caused by the .so of my custom handler not being found.

Steps to Reproduce

Assuming custom handler structure is the same as other step handlers and found in the same directory:

  1. Add custom handler to main CMake file and build.sh in local repository instance (Same version as on target device)
  2. Build custom handler locally using the command: ./scripts/build.sh -t Release --step-handlers kb/tf
  3. Copy custom handlers .so from ./out/lib/ to device and put file in correct folder
  4. Register custom handler on device using AducIotAgent --extension-type updateContentHandler --register-extension /var/lib/adu/extensions/sources/libkb_tf_1.so --extension-id 'kb/tf:1'
  5. Run an update using the Azure Portal, reference handler via 'kb/tf:1'

Device Information

Logs

Relevant snippet of /var/log/adu/du-agent.20240515-134117.log:

...
2024-05-15T14:10:06.0414Z 2446[2686] [D] Determining contract version for 'microsoft/update-manifest:5'. [LoadUpdateContentHandlerExtension:303]
2024-05-15T14:10:06.0414Z 2446[2686] [D] Got 1.0 contract version for 'microsoft/update-manifest:5' handler [LoadUpdateContentHandlerExtension:331]
2024-05-15T14:10:06.0414Z 2446[2686] [D] Caching new handler for 'microsoft/update-manifest:5'. [LoadUpdateContentHandlerExtension:340]
2024-05-15T14:10:06.0692Z 2446[2686] [I] Action 'Download' complete. Result: 0 (failed), 805306373 (0x30000005) [ADUC_Workflow_WorkCompletionCallback:890]
2024-05-15T14:10:06.0692Z 2446[2686] [I] WorkCompletionCallback: Download failed. Going to state Failed [ADUC_Workflow_WorkCompletionCallback:1016]
2024-05-15T14:10:06.0692Z 2446[2686] [I] Setting UpdateState to Failed [ADUC_Workflow_SetUpdateStateHelper:1088]

2024-05-15T14:10:06.0693Z 2446[2686] [D] [ADUC_D2C_Message_SendAsync:526] ==== MULTI-LINE LOG BEGIN ====
Queueing message (t:0, c:0x84002b60, m:{"deviceUpdate":{"__t":"c","agent":{"lastInstallResult":{"stepResults":{"step_0":{"resultCode":0,"extendedResultCodes":"30000005","resultDetails":null}},
"resultCode":0,"extendedResultCodes":"30000005,A0000FFF","resultDetails":"Cannot load a handler for step #0 (handler :kb\/tf:1)"},
"state":255,"workflow":{"action":3,"id":"22c7c7b9-acf7-4bf7-a20f-b7d7e55e1db6"}}}})
2024-05-15T14:10:06.0693Z 2446[2686] [D] [ADUC_D2C_Message_SendAsync:526] ==== MULTI-LINE LOG END ====
...

(newlines added for readablity)

Content of /var/log/adu/steps-handler.20240515-141006.log:

2024-05-15T14:10:06.0414Z 2446[2686] [I] Instantiating an Update Manifest Handler for MSOE [CreateUpdateContentHandlerExtension:30]
2024-05-15T14:10:06.0417Z 2446[2686] [I] Loading handler for step #0 (handler: 'kb/tf:1') [StepsHandler_Download:608]
2024-05-15T14:10:06.0417Z 2446[2686] [I] Loading handler for 'kb/tf:1'. [LoadUpdateContentHandlerExtension:208]
2024-05-15T14:10:06.0419Z 2446[2686] [I] Loading extension 'kb/tf:1'. Reg file : /var/lib/adu/extensions/update_content_handlers/kb_tf_1/content_handler.json [LoadExtensionLibrary:85]
2024-05-15T14:10:06.0689Z 2446[2686] [E] Cannot load handler file /var/lib/adu/extensions/sources/libkb_tf_1.so. /var/lib/adu/extensions/sources/libkb_tf_1.so: cannot open shared object file: No such file or directory. [LoadExtensionLibrary:149]
2024-05-15T14:10:06.0691Z 2446[2686] [I] Uninitializing config info. [ADUC_ConfigInfo_UnInit:525]
2024-05-15T14:10:06.0691Z 2446[2686] [E] Cannot load a handler for step #0 (handler :kb/tf:1) [StepsHandler_Download:615]

Additional Information

I am trying to implement a custom step handler for the AducIotAgent, following this guide, however I cannot seem to get it to work.

When I try to push an update using the Azure Portal, I get an error (found in the steps-handler log file). This error is generated because the function ADUCPAL_dlopen, run here, fails to open the handler. This function, found in src/libaducpal/src/dlfcn.c seems to use an external function (LoadLibrary()) defined in windows.h to load the library.

So I cannot check in further detail where this error arises, and the error message is not clear about what is causing the problem. I have checked multiple times, and the filepath is correct. The hash of the file is also correct, otherwise the program would have responded with a different error. So the agent can verify that the hash of my custom handler is correct, but right afterwards it tells me that 'no such file exists'? I find this contradictory. Did I build the custom handler wrong? Or did I not set some parameters correctly? How do I fix this issue?

I already spoke to Microsoft Support using a support request in the Azure Portal. I was told to open an issue here, with a reference to that support ticket. It's ID is 2404110050002607

Nox-MSFT commented 1 month ago

@MoritzSKB , thank you for reporting this issue. Could you check/share the owner and permission of the installed extension (/var/lib/adu/extensions/sources/libkb_tf_1.so) please?

MoritzSKB commented 1 month ago

@Nox-MSFT I checked the permissions and even with them being set to the same as the other handler .so files, the error persists.

Nox-MSFT commented 1 month ago

@MoritzSKB are you able to debug it? I'm wondering if this is a dependency issue. How did you build your extension? Is there any dependency that need to be installed on the device?

(according to this error: "cannot open shared object file: No such file or directory.")

MoritzSKB commented 3 weeks ago

@Nox-MSFT I am not too familiar with the project, so I can only go off the logs I sent here before. I cloned this GitHub project on a local Linux machine, in which I build my extension by adding it to the respective CMake files and the build.sh script, so that it would be included. I did not change any of the scripts/files besides adding my handler as an option to the list of other step_handlers. I don't think it is an issue with dependencies, as we have tested the default script handlers and got a response. And our custom handler does not use any functionality that is not being used by other handlers already.