Azure / azure-iot-sdk-c

A C99 SDK for connecting devices to Microsoft Azure IoT services
https://azure.github.io/azure-iot-sdk-c
Other
585 stars 739 forks source link

Async responses for Direct Methods #2570

Closed oscarh closed 4 months ago

oscarh commented 7 months ago

When receiving a Direct Method to the Azure IoT C SDK, the current API forces the SDK callback to deliver the response synchronously. At the same time there is documentation saying that "Your callback should in general limit itself to quick, CPU bound operations" [1]. This does make Direct Methods rather useless, if the method need to do some more work than just copy data as per your sample code. In our case, we want to pass Direct Methods to sub-systems and return the response when that sub-system has replied. We cannot block the whole SDK while waiting for those responses.

There was an API, IoTHubClient_SetDeviceMethodCallback_Ex in the (now deprecated) iothub_client/inc/iothub_client.h. I guess we are discouraged to use this API.

This question seems to be answered in quite a few places, but there is no clear answer from your side. For instance #1647 and #1437. #1647 was closed, since the work should be done in #1437, but that is closed without any good explanation. There is a discussion here

1984, with a comment from 2021 stating "Will update you soon." Not sure how soon that is?

[1] - https://github.com/Azure/azure-iot-sdk-c/blob/main/doc/threading_notes.md#avoid-long-running-callbacks

Update: I refer to the multi-threaded functions above, but wanted to talk mainly about: IoTHubClient_LL_SetDeviceMethodCallback_Ex and IoTHubDeviceClient_LL_SetDeviceMethodCallback

ericwolz commented 7 months ago

IoTHubDeviceClient_LL_SetDeviceMethodCallback The LL (low-level) APIs are not multithreaded, so any callback will block the SDK pipeline including MQTT communication. Blocking this for a long time will cause MQTT PUBACK and PINGREQ timeouts.

IoTHubDeviceClient_SetDeviceMethodCallback The non LL APis uses the multithreaded SDK. All callbacks are stored on a queue and notification are sent to the application on another thread that is not part of the SDK pipeline.

oscarh commented 6 months ago

@ericwol-msft thank you for the reply. We are uing the LL API and understand this limitation. The problem is that the IOTHUB_CLIENT_DEVICE_METHOD_CALLBACK_ASYNC callback registered in IoTHubDeviceClient_LL_SetDeviceMethodCallback MUST deliver a repsonse, or else the SDK will consider the direct method as failed iothub_client/src/iothub_client_core_ll.c:

                if (payload_resp != NULL && response_size > 0)
                {
                    result = handleData->IoTHubTransport_DeviceMethod_Response(handleData->deviceHandle, response_id, payload_resp, response_size, result);
                }
                else
                {
                    result = MU_FAILURE;
                }
oscarh commented 6 months ago

Regarding multithreaded API. Is the queue handling synchronous, meaning that one blocking direct method would block other? In this case, this would also be a big problem for users of that API.

oscarh commented 6 months ago

I see that I referred to the wrong functions above. The function I wanted to use is IoTHubClient_LL_SetDeviceMethodCallback_Ex which is deprecated:

IoTHubClient_LL_SetDeviceMethodCallback_Ex is deprecated. Use IoTHubDeviceClient_LL_SetDeviceMethodCallback() instead.

ericwolz commented 6 months ago

Regarding multithreaded API. Is the queue handling synchronous, meaning that one blocking direct method would block other? In this case, this would also be a big problem for users of that API.

Yes, there is only one worker thread for the callbacks, The callback blocks other notifications.

ewertons commented 6 months ago

Hi @oscarh , thanks for your question. Just to make it clear, functions or headers that have been marked as deprecated MUST not be used.

Regarding your question about direct methods, this feature is implemented to have a short life-cycle between request and response. thus the reason to have the reply at the return of the request callback. This is for optimization of resources and scalability on the server side.

If you do have a command that must take a longer time to execute and then report a status, you must implement the request as a begin and end design. First fire a direct method requesting an command to be started in your client (e.g., my_command_begin) and then after a certain time (estimated by you in your implementation), send another direct method requesting either a status or the final report of the execution (e.g., my_command_end).

oscarh commented 6 months ago

Hi @ewertons,

Thanks for you reply.

I'm afraid your comment doesn't make much sense though, since the REST API for calling a direct method has a responseTimeoutInSeconds which can be between 5 and 300 seconds. [1]. I hope we're not expecting the IoT Hub to add 300 seconds latency? Apart from that, there is a connectTimeoutInSeconds which can also be up to 300 seconds. During the second timeout the hub will wait for an offline device to come online. If the backend can handle this in a scalable way, it dosen't make sense that the device must only do "quick, CPU bound operations".

Furthermore, if I would try to implement your suggestion with a start and and end part, the end part would need to poll the device, using direct methods, since I cannot know when the response is ready. It would probably consume more resources everywhere.

Actually the whole things feels like something went wrong during a refactoring, why would otherwise the type for the callback be named IOTHUB_CLIENT_DEVICE_METHOD_CALLBACK_ASYNC although it cannot do any async operation?

[1] - https://learn.microsoft.com/en-us/azure/iot-hub/iot-hub-devguide-direct-methods#method-invocation

ewertons commented 6 months ago

Hi @oscarh ,

Unfortunately the Azure IoT C SDK does not implement a way to receive direct methods requests and provide responses independently.

The options I shared are alternatives given the current implementation and limitations of the Azure IoT C SDK. If you are looking for more granular control of an SDK for Azure IoT Hub, please take a look at Azure SDK for C, which is also more suited for embedded devices.

oscarh commented 6 months ago

HI @ewertons ,

But it did support it until the function IoTHubClient_SetDeviceMethodCallback_Ex in header iothub_client/inc/iothub_client.h was deprecated in commit e3e0a0bdfac5755b170bc09091b83a2857843f3a on Tue Oct 27 08:46:40 2020 -0600.

ewertons commented 5 months ago

That seems correct. At this time, we do not plan to implement this enhancement in the currently supported API headers to match the behavior of the deprecated header you mentioned.

ewertons commented 4 months ago

Closing this issue based on the latest triage decision. Thanks, Azure IoT SDK Team