Azure / azure-iot-sdk-java

A Java SDK for connecting devices to Microsoft Azure IoT services
https://azure.github.io/azure-iot-sdk-java/

Issue Title: DeviceClient Fails to Reconnect After Extended Internet Outage in Staging Environment #1797

Closed PratheeshShekar closed 1 month ago

PratheeshShekar commented 1 month ago

Description: We have deployed our application on Docker in an on-premises environment. The application creates a com.microsoft.azure.sdk.iot.device.DeviceClient using an AMQPS connection and listens for incoming data from IoT Hub using the following callback:

deviceClient.setMessageCallback((message, callbackContext) -> {
    // Handle incoming message
});

Dependency (Maven coordinates): com.microsoft.azure.sdk.iot:iot-device-client:2.5.0

Observed Behavior: Everything works fine in our local environment. However, when testing in the staging environment, we encounter an issue:

Scenario: If the internet connection is interrupted for an extended period (10-15 minutes) and then restored, the DeviceClient does not receive any messages from IoT Hub until we restart the application.

Expected Behavior: Even if the internet is down for a long time, once the connection is restored, the DeviceClient should automatically resume receiving messages from IoT Hub without requiring a restart of the application.

Steps to Reproduce:
1. Deploy the application with DeviceClient using the AMQPS protocol in a Docker container in an on-premises environment.
2. Interrupt the internet connection for 10-15 minutes.
3. Restore the internet connection.
4. Observe that the application no longer receives data from IoT Hub until it is restarted.

Environment:
- Docker-based on-premises environment
- com.microsoft.azure.sdk.iot.device.DeviceClient version 2.5.0
- AMQPS protocol

Questions:
1. Is this a known issue with the version of the iot-device-client library we are using?
2. Are there any recommended configurations or changes that would allow the DeviceClient to automatically reconnect and resume receiving messages after long internet outages?

timtay-microsoft commented 1 month ago

By default, the device client library keeps trying to reconnect for roughly 4 minutes before giving up. This value is configurable, but we generally recommend that you also add retry logic in your application layer. We have some sample code that demonstrates how to do that.
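
For readers without the linked sample at hand, here is a minimal sketch of what application-layer retry logic can look like. It is not the SDK's sample and it calls no SDK API: `Opener` is a hypothetical stand-in for whatever re-opens your client, and the class name, method names, attempt count, and delay values are all illustrative assumptions. The idea is to keep retrying the open call with a capped exponential backoff once the library's own retry window has expired.

```java
import java.io.IOException;

final class ReconnectSketch {

    /** Hypothetical stand-in for the code that (re)opens the device client. */
    @FunctionalInterface
    interface Opener {
        void open() throws IOException;
    }

    /** Delay before retrying after the given 1-based attempt: doubles each time, capped at maxDelayMs. */
    static long backoffDelayMs(int attempt, long initialDelayMs, long maxDelayMs) {
        long delay = initialDelayMs;
        for (int i = 1; i < attempt && delay < maxDelayMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMs);
    }

    /**
     * Calls opener.open() until it succeeds or maxAttempts is reached.
     * Returns true on success, false if every attempt failed or the thread was interrupted.
     */
    static boolean reopenWithBackoff(Opener opener, int maxAttempts,
                                     long initialDelayMs, long maxDelayMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                opener.open();
                return true;
            } catch (IOException e) {
                if (attempt == maxAttempts) {
                    break; // no point sleeping after the final failure
                }
                try {
                    Thread.sleep(backoffDelayMs(attempt, initialDelayMs, maxDelayMs));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated outage: the first two open attempts fail, the third succeeds.
        boolean ok = reopenWithBackoff(() -> {
            if (++calls[0] < 3) {
                throw new IOException("network still down");
            }
        }, 10, 10, 100);
        System.out.println("reconnected=" + ok + " after " + calls[0] + " attempts");
    }
}
```

In a real application, this loop would typically be triggered when the client reports that it has given up reconnecting, and the lambda would wrap the client's own open call. For outages of unknown length, a large or unbounded attempt count with a modest delay cap is a reasonable choice.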

PremSahooESL commented 1 month ago

Thanks for the sample retry code. We will try handling it this way and will let you know the result as soon as possible.

PratheeshShekar commented 1 month ago

Thanks for the solution. It works fine now.