Azure / azure-iot-sdk-c

A C99 SDK for connecting devices to Microsoft Azure IoT services
https://azure.github.io/azure-iot-sdk-c
Other
584 stars 739 forks source link

OPTION_MESSAGE_TIMEOUT client option depecation #2550

Closed bennydiamond closed 4 months ago

bennydiamond commented 8 months ago

We are using the Azure IoT SDK C source code package in our embedded/firmware project. We might have come into a situation where some devices have spotty communication with the cloud and outgoing messages (Device to cloud) are getting stuck and accumulated in doubly linked lists to the point the device runs out of memory (it's a firmware platform so, <1MB of SRAM can fill up quickly). We aren't sure if the connectivity issue come from the IoT device itself or the IotHub which might rate-limit some more "chatty" IoT devices. We do get a connectivity error at some point (which will flush the pending messages in send queue) but in certain scenarios, it might not be fast enough and a lot of outgoing messages could be queued up right before it.

We are connecting to the IotHub through MQTT.

I saw in the code there is a "Message Timeout" client option that can be enabled. However, there is a clear Deprecation notice right above it. Unfortunately, there is no replacement mechanism we could enable for a MQTT connection scheme to timeout messages in pending send queues in case they get stuck for a long time.

https://github.com/Azure/azure-iot-sdk-c/blob/40a62f61e887026ba04d8c7fe77f967cfd134d48/iothub_client/inc/iothub_client_options.h#L53C17-L53C17

Keeping in mind we are using MQTT, what other option is there to ensure we don't encounter a situation where outgoing messages would pile-up in an outgoing queue without timing out already queued-in messages?

Thank you.

domasgim commented 8 months ago

@bennydiamond we have encountered a similar situation in our solution using MQTT protocol as well, here's our solution to adjust queues message timeout, you can adjust these values according to your needs but with this patch the queued messages will be flushed after 10 seconds with one resend retry after 5 seconds:

This lowers timeout and retry values since we do not want
messages to be kept in internal SDK message queue for too long

There is OPTION_MESSAGE_TIMEOUT that may be used in the SDK
but it is deprecated

--- a/iothub_client/src/iothubtransport_mqtt_common.c
+++ b/iothub_client/src/iothubtransport_mqtt_common.c
@@ -46,8 +46,8 @@
 #define DEFAULT_CONNACK_TIMEOUT             30 // 30 seconds
 #define BUILD_CONFIG_USERNAME               24
 #define SAS_TOKEN_DEFAULT_LEN               10
-#define RESEND_TIMEOUT_VALUE_MIN            1*60
-#define TELEMETRY_MSG_TIMEOUT_MIN           2*60
+#define RESEND_TIMEOUT_VALUE_MIN            5
+#define TELEMETRY_MSG_TIMEOUT_MIN           10
 #define DEFAULT_CONNECTION_INTERVAL         30
 #define FAILED_CONN_BACKOFF_VALUE           5
 #define STATUS_CODE_FAILURE_VALUE           500

Or instead of a patch the two values right here https://github.com/Azure/azure-iot-sdk-c/blob/main/iothub_client/src/iothubtransport_mqtt_common.c#L50-L49

Do note that this may be considered a workaround for such problem but it suited our needs

ericwolz commented 8 months ago

connection and messaging reliability is documented here

We might have come into a situation where some devices have spotty communication with the cloud and outgoing messages (Device to cloud) are getting stuck and accumulated in doubly linked lists to the point the device runs out of memory

This SDK does not implement a max queue size as that is best determined by the application and platform requirements. You would need to implement an application dependent queue to be handle your application needs for messaging and monitor the sending of the messages and their corresponding status callbacks.

Normal operation of the SDK will temporarily hold on to the message and try to reconnect to the hub if disconnected due to infrastructure issue.

Connection Retry Policies can be set to application specific needs.

ewertons commented 4 months ago

Closing this issue based on the last comment from @ericwol-msft Please feel free to reopen it or open a new one if you would like to follow up. Thanks, Azure IoT SDK Team