Azure / azure-iot-sdk-csharp

A C# SDK for connecting devices to Microsoft Azure IoT services

Device doesn't reconnect after switching wifi network #1000

Closed FlorisDevreese closed 5 years ago

FlorisDevreese commented 5 years ago

Description of the issue:

The DeviceClient doesn't automatically create a new connection when the machine switches to a different wifi network.

Scenario

  1. Connect computer to wifi network A
  2. Run this example
  3. The example starts to publish dummy data to the IoT Hub.
  4. Disconnect from wifi network A, and connect to wifi network B
  5. The example stops publishing data. No error is shown in the console.
  6. After some time the example throws the following error:
    Unhandled Exception: Microsoft.Azure.Devices.Client.Exceptions.IotHubCommunicationException: Transient network error occurred, please retry. ---> System.Net.Sockets.SocketException: Connection timed out
    at DotNetty.Transport.Channels.Sockets.TcpSocketChannel.DoReadBytes(IByteBuffer byteBuf)
    at DotNetty.Transport.Channels.Sockets.AbstractSocketByteChannel.SocketByteChannelUnsafe.FinishRead(SocketChannelAsyncOperation operation)
    --- End of stack trace from previous location where exception was thrown ---
    at Microsoft.Azure.Devices.Client.Transport.Mqtt.MqttIotHubAdapter.SendMessageAsync(IChannelHandlerContext context, Message message)
    at Microsoft.Azure.Devices.Client.Transport.Mqtt.MqttIotHubAdapter.WriteAsync(IChannelHandlerContext context, Object data)
    at Microsoft.Azure.Devices.Client.Transport.ErrorDelegatingHandler.<>c__DisplayClass22_0.<<ExecuteWithErrorHandlingAsync>b__0>d.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at Microsoft.Azure.Devices.Client.Transport.ErrorDelegatingHandler.ExecuteWithErrorHandlingAsync[T](Func`1 asyncOperation)
    --- End of inner exception stack trace ---
    at Microsoft.Azure.Devices.Client.Transport.ErrorDelegatingHandler.ExecuteWithErrorHandlingAsync[T](Func`1 asyncOperation)
    at Microsoft.Azure.Devices.Client.Transport.RetryDelegatingHandler.<>c__DisplayClass14_0.<<SendEventAsync>b__0>d.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
    at Microsoft.Azure.Devices.Client.Transport.RetryDelegatingHandler.SendEventAsync(Message message, CancellationToken cancellationToken)
    at Microsoft.Azure.Devices.Client.InternalClient.SendEventAsync(Message message)
    at simulated_device.SimulatedDevice.SendDeviceToCloudMessagesAsync() in /home/floris/temp/azure-iot-samples-csharp-master/iot-hub/Quickstarts/simulated-device-2/SimulatedDevice.cs:line 77
    at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
    --- End of stack trace from previous location where exception was thrown ---
    at System.Threading.ThreadPoolWorkQueue.Dispatch()
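
Independent of whether the SDK reconnects on its own, the crash in step 6 happens because the exception from SendEventAsync is unhandled. A minimal sketch of an application-side wrapper that retries transient failures is shown here. `TransientRetry` and its parameters are hypothetical names introduced for illustration and are not part of the Azure IoT SDK; the attempt count and delays are arbitrary.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of the Azure IoT SDK.
public static class TransientRetry
{
    // Runs `operation` and retries it while `isTransient` returns true for the
    // thrown exception, up to `maxAttempts` attempts in total. The wait between
    // attempts starts at `initialDelay` and doubles after every failure.
    public static async Task RunAsync(
        Func<Task> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts,
        TimeSpan initialDelay,
        CancellationToken cancellationToken = default)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        TimeSpan delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                await operation().ConfigureAwait(false);
                return; // success, stop retrying
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Non-transient errors and the final failed attempt are not
                // caught here, so they propagate to the caller.
                Console.WriteLine(
                    $"Attempt {attempt} failed ({ex.GetType().Name}); retrying in {delay.TotalSeconds:F1}s");
            }

            await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
            delay = TimeSpan.FromTicks(delay.Ticks * 2);
        }
    }
}
```

In the sample from the stack trace, the call site could look like `await TransientRetry.RunAsync(() => s_deviceClient.SendEventAsync(message), ex => ex is IotHubCommunicationException, 5, TimeSpan.FromSeconds(1));`. This only keeps the process alive across the timeout; it does not explain why the client fails to reconnect.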

Expected result

I would expect that the Azure DeviceClient automatically creates a new connection via the new available network as described here and here.

abhipsaMisra commented 5 years ago

Thank you for opening this issue. I am looking into this.

abhipsaMisra commented 5 years ago

I tried the above sample with slight modifications on Linux (ubuntu 18.04) and was unable to reproduce the issue. The steps I followed are:

  1. Modified the above sample to add a connection status change handler to the device client (to be notified of any connection status change events).
    int connectionStatusChangeCount = 0;
    s_deviceClient.SetConnectionStatusChangesHandler((status, statusChangeReason) =>
    {
        connectionStatusChangeCount++;
        Console.WriteLine($"{nameof(ConnectionStatusChangesHandler)}: status={status} statusChangeReason={statusChangeReason} count={connectionStatusChangeCount}");
    });
  2. Ran the sample.
  3. Ran the following command to kill TCP connection to my IoTHub:
    tcpkill -i eth0 host <IP_of_my_IoTHub>
  4. Stopped the tcpkill command after about 1 minute.

Output:

$ dotnet TelemetryOnReconnect.dll mqtt
IoT Hub Quickstarts #1 - Simulated device. Ctrl-C to exit. - [mqtt]

ConnectionStatusChangesHandler: status=Connected statusChangeReason=Connection_Ok count=1
8/7/19 1:45:52 AM > Sending message: {"temperature":33.5093082271094,"humidity":62.066992354610463}
8/7/19 1:45:53 AM > Sending message: {"temperature":31.004088635558304,"humidity":70.0825022114825}
8/7/19 1:45:54 AM > Sending message: {"temperature":26.698869237070376,"humidity":71.578489268002329}
8/7/19 1:45:55 AM > Sending message: {"temperature":23.296417146127865,"humidity":73.805906983933369}
8/7/19 1:45:56 AM > Sending message: {"temperature":29.586119602707271,"humidity":76.258299972982286}
ConnectionStatusChangesHandler: status=Disconnected statusChangeReason=Communication_Error count=2
ConnectionStatusChangesHandler: status=Disconnected statusChangeReason=Retry_Expired count=3
ConnectionStatusChangesHandler: status=Connected statusChangeReason=Connection_Ok count=4
8/7/19 1:47:05 AM > Sending message: {"temperature":26.741427945318364,"humidity":68.415281040787363}
8/7/19 1:47:06 AM > Sending message: {"temperature":30.122881433983743,"humidity":62.825483792845851}
8/7/19 1:47:08 AM > Sending message: {"temperature":20.885903023130215,"humidity":78.944681183874934}
8/7/19 1:47:09 AM > Sending message: {"temperature":27.157481583839974,"humidity":75.067271327165543}
8/7/19 1:47:10 AM > Sending message: {"temperature":31.124160432314575,"humidity":67.876231590228258}

I have the default retry policy enabled, which is ExponentialBackoff, with a DefaultOperationTimeout of 4 minutes.
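
For reference, a minimal sketch of setting these two values explicitly on a device client, in case the application overrides them somewhere. It assumes the Microsoft.Azure.Devices.Client NuGet package; the class name `RetryConfigSketch` is hypothetical, and the backoff values are illustrative rather than a statement of the SDK's defaults. Only the 4-minute operation timeout comes from the setup described above.

```csharp
using System;
using Microsoft.Azure.Devices.Client;

// Hypothetical wrapper class for illustration; requires the
// Microsoft.Azure.Devices.Client package.
public static class RetryConfigSketch
{
    public static DeviceClient Create(string connectionString)
    {
        DeviceClient client = DeviceClient.CreateFromConnectionString(
            connectionString, TransportType.Mqtt);

        // Exponential backoff retry policy. The numbers below are
        // illustrative assumptions, not values confirmed by this thread.
        client.SetRetryPolicy(new ExponentialBackoff(
            int.MaxValue,                    // retry count
            TimeSpan.FromMilliseconds(100),  // minimum backoff
            TimeSpan.FromSeconds(10),        // maximum backoff
            TimeSpan.FromMilliseconds(100))); // delta backoff

        // Operation timeout of 4 minutes, matching the setup described above.
        client.OperationTimeoutInMilliseconds = 4 * 60 * 1000;

        return client;
    }
}
```

With settings like these, an operation such as SendEventAsync keeps being retried until the operation timeout elapses, which would be consistent with the delay before the exception in the original report.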

I will retry this specifically with switching wifi networks, just so we cover all bases. In the meantime, could you provide the application logs with the modifications below:

  1. After initializing the device client, add the connection status change handler:
    int connectionStatusChangeCount = 0;
    s_deviceClient.SetConnectionStatusChangesHandler((status, statusChangeReason) =>
    {
        connectionStatusChangeCount++;
        Console.WriteLine($"{nameof(ConnectionStatusChangesHandler)}: status={status} statusChangeReason={statusChangeReason} count={connectionStatusChangeCount}");
    });
  2. Add the file below to output SDK logs to the console. Note that this method will substantially slow down execution.
    a. Add common\test\ConsoleEventListener.cs to your project.
    b. Instantiate the listener:
    private readonly ConsoleEventListener _listener = new ConsoleEventListener(new string[]{ "Microsoft-Azure-", "DotNetty-" });

Please share the logs generated with the above modifications. Thanks!

prmathur-microsoft commented 5 years ago

As we have not heard from you in a long time, I'm closing this issue. Feel free to reopen it when you have more information.
