Azure / iot-edge-v1

Azure IoT Edge
http://azure.github.io/iot-edge/
Other
524 stars 258 forks source link

tempSensor exception #514

Closed rmr closed 6 years ago

rmr commented 6 years ago

I have been trying to get the edge runtime to work on Linux (armbian). The Agent and Hub seem to be running fine on the device. From within the azure web control interface, I can see there are twin updates. Can these be seen as a type of heartbeats?

The temperature example doesn't seem to work. On two x86 test machines, the example works fine (simulating devices)

The ARM machine has 2GB RAM with only 9% being used ( I think 1GB would suffice).

$ uname -a Linux clearfogbase 4.14.17-mvebu #34 SMP Mon Feb 5 09:46:56 CET 2018 armv7l armv7l armv7l GNU/Linux $ docker -v Docker version 18.01.0-ce, build 03596f5 $ iotedgectl --version iotedgectl 1.0.0rc19

Unhandled Exception: System.AggregateException: One or more errors occurred. (Operation timeout expired.) ---> System.TimeoutException: Operation timeout expired. at Microsoft.Azure.Devices.Client.DeviceClient.<>cDisplayClass79_0.b2(Task t) at System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke() at System.Threading.Tasks.Task.<>c.<.cctor>b276_1(Object obj) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot) --- End of stack trace from previous location where exception was thrown --- at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at SimulatedTemperatureSensor.Program.d4.MoveNext() in /opt/vsts/work/1/s/edge-modules/SimulatedTemperatureSensor/src/Program.cs:line 60 --- End of inner exception stack trace --- at System.Threading.Tasks.Task1.GetResultCore(Boolean waitCompletionNotification) at System.Threading.Tasks.Task1.get_Result() at SimulatedTemperatureSensor.Program.Main() in /opt/vsts/work/1/s/edge-modules/SimulatedTemperatureSensor/src/Program.cs:line 27

mrohera commented 6 years ago

More about Twins in general and specifically how they are used for modules can be learnt at these links:

Could you send us the Edge Hub logs for the exception you hare reporting?

rmr commented 6 years ago

So far, I am not able to figure out the azure web panel logging and storage stuff. So I have the logs from the agent and hub with verbose

It seems to not think there is network. I have done a docker exec ping and others, and there is network.

Edgehub

2018-02-12 10:11:31.853 +00:00 [INF] [EdgeHub] - Starting MQTT Server 2018-02-12 10:11:32.059 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.MqttProtocolHead] - Starting 2018-02-12 10:11:32.286 +00:00 [DBG] [DotNetty.Buffers.PooledByteBufferAllocator] - -Dio.netty.allocator.numHeapArenas: 4 2018-02-12 10:11:32.287 +00:00 [DBG] [DotNetty.Buffers.PooledByteBufferAllocator] - -Dio.netty.allocator.pageSize: 8192 2018-02-12 10:11:32.287 +00:00 [DBG] [DotNetty.Buffers.PooledByteBufferAllocator] - -Dio.netty.allocator.maxOrder: 11 2018-02-12 10:11:32.287 +00:00 [DBG] [DotNetty.Buffers.PooledByteBufferAllocator] - -Dio.netty.allocator.chunkSize: 16777216 2018-02-12 10:11:32.287 +00:00 [DBG] [DotNetty.Buffers.PooledByteBufferAllocator] - -Dio.netty.allocator.tinyCacheSize: 512 2018-02-12 10:11:32.287 +00:00 [DBG] [DotNetty.Buffers.PooledByteBufferAllocator] - -Dio.netty.allocator.smallCacheSize: 256 2018-02-12 10:11:32.287 +00:00 [DBG] [DotNetty.Buffers.PooledByteBufferAllocator] - -Dio.netty.allocator.normalCacheSize: 64 2018-02-12 10:11:32.288 +00:00 [DBG] [DotNetty.Buffers.PooledByteBufferAllocator] - -Dio.netty.allocator.maxCachedBufferCapacity: 32768 2018-02-12 10:11:32.288 +00:00 [DBG] [DotNetty.Buffers.PooledByteBufferAllocator] - -Dio.netty.allocator.cacheTrimInterval: 8192 2018-02-12 10:11:32.327 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.MqttProtocolHead] - Initializing TLS endpoint on port 8883. 2018-02-12 10:11:32.398 +00:00 [DBG] [DotNetty.Transport.Channels.DefaultChannelId] - -Dio.netty.processId: 3409367 (auto-detected) 2018-02-12 10:11:32.557 +00:00 [DBG] [DotNetty.Transport.Channels.DefaultChannelId] - -Dio.netty.machineId: 02:42:AC:FF:FE:12:00:04 (auto-detected) 2018-02-12 10:11:32.563 +00:00 [DBG] [DotNetty.Buffers.ByteBufferUtil] - -Dio.netty.allocator.type: pooled 2018-02-12 10:11:32.609 +00:00 [DBG] [DotNetty.Common.ResourceLeakDetector] - io.netty.leakDetection.level: simple 2018-02-12 10:11:32.609 +00:00 [DBG] [DotNetty.Common.ResourceLeakDetector] - io.netty.leakDetection.maxRecords: 4 2018-02-12 10:11:32.789 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.MqttProtocolHead] - Started ^T2018-02-12 10:14:43.366 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Core.TwinManager] - ConnectionLost for clearfogpro-s1/$edgeHub 2018-02-12 10:14:43.906 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Core.TwinManager] - ConnectionEstablished for clearfogpro-s1/$edgeHub 2018-02-12 10:14:43.907 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Core.TwinManager] - Processing ConnectionEstablished for device clearfogpro-s1/$edgeHub 2018-02-12 10:14:43.920 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Core.TwinManager] - Getting twin for clearfogpro-s1/$edgeHub on ConnectionEstablished 2018-02-12 10:14:43.922 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.SessionStateStoragePersistenceProvider] - Cloud connection established, setting subscriptions for clearfogpro-s1/$edgeHub 2018-02-12 10:14:43.926 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.SessionStateStoragePersistenceProvider] - No session state found in store for clearfogpro-s1/$edgeHub 2018-02-12 10:14:43.976 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudProxy] - Getting twin for device clearfogpro-s1/$edgeHub 2018-02-12 10:14:43.976 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Core.TwinManager] - Successfully got twin for clearfogpro-s1/$edgeHub from cloud at desired version 2 reported version 15 2018-02-12 10:14:43.977 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Core.TwinManager] - Updating cached twin for clearfogpro-s1/$edgeHub from desired version 2 to 2 and reported version 14 to 15

2018-02-12 10:19:16.423 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Core.TwinManager] - ConnectionLost for clearfogpro-s1/$edgeHub

Edgehub

2018-02-12 10:40:51.673 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory] - Executing command: "docker stop -t 10 tempSensor" 2018-02-12 10:40:51.677 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory] - Executing command for operation ["stop"] succeeded. 2018-02-12 10:40:51.678 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory] - Executing command: "docker start tempSensor" 2018-02-12 10:40:52.954 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory] - Executing command for operation ["start"] succeeded. 2018-02-12 10:40:52.954 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory] - Executing command: "Update health stats for module tempSensor" 2018-02-12 10:40:52.955 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory] - Executing command for operation ["Update health stats for module tempSensor"] succeeded. 2018-02-12 10:40:52.956 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory] - Executing command for operation ["Command Group: ( [docker stop -t 10 tempSensor] [docker start tempSensor] [Update health stats for module tempSensor] )"] succeeded.

... 2018-02-12 10:44:09.235 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.Planners.HealthRestartPlanner] - HealthRestartPlanner created Plan, with 0 command(s). 2018-02-12 10:44:14.255 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.Planners.HealthRestartPlanner] - HealthRestartPlanner created Plan, with 0 command(s). 2018-02-12 10:44:17.427 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.IoTHub.EdgeAgentConnection] - Connection status changed to Disconnected_Retrying with reason No_Network 2018-02-12 10:44:17.978 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.IoTHub.EdgeAgentConnection] - Connection status changed to Connected with reason Connection_Ok 2018-02-12 10:44:18.125 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.IoTHub.EdgeAgentConnection] - Edge agent updated deployment config from desired properties. 2018-02-12 10:44:18.125 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.IoTHub.EdgeAgentConnection] - Updated edge agent configuration from twin. 2018-02-12 10:44:19.281 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.Planners.HealthRestartPlanner] - HealthRestartPlanner created Plan, with 0 command(s). ..... 2018-02-12 10:44:59.519 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.Planners.HealthRestartPlanner] - HealthRestartPlanner created Plan, with 0 command(s). 2018-02-12 10:44:59.701 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Agent.IoTHub.Reporters.IoTHubReporter] - Updated reported properties 2018-02-12 10:45:04.720 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.Planners.HealthRestartPlanner] - HealthRestartPlanner created Plan, with 0 command(s). 2018-02-12 10:45:09.761 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.Planners.HealthRestartPlanner] - HealthRestartPlanner created Plan, with 0 command(s).

rmr commented 6 years ago

Every now and then (without changing anything)

2018-02-12 14:37:15.371 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Core.Planners.HealthRestartPlanner] - HealthRestartPlanner created Plan, with 0 command(s). 2018-02-12 14:37:18.553 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.IoTHub.EdgeAgentConnection] - Connection status changed to Disconnected_Retrying with reason No_Network 2018-02-12 14:37:19.078 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.IoTHub.EdgeAgentConnection] - Connection status changed to Connected with reason Connection_Ok 2018-02-12 14:37:19.124 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.IoTHub.EdgeAgentConnection] - Edge agent updated deployment config from desired properties. 2018-02-12 14:37:19.124 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.IoTHub.EdgeAgentConnection] - Updated edge agent configuration from twin.

rmr commented 6 years ago

I have been debugging with tcpdump. It seems as though after 10-15 minutes, edgeAgent reports

_Connection status changed to Disconnected_Retrying with reason NoNetwork

I then used docker exec, and attempted curl with the port number 5671, which could connect.

From the azure web panel, it always says disconnected (after the 10-15 minutes). When the connection is available, I can see the following using tcpdump.

IP 13.69.192.43.amqps > 172.18.0.2.58945: Flags [P.], seq 12923:12992, ack 4743, win 512, options [nop,nop,TS val 28506681 ecr 140332479], length 69 IP 172.18.0.2.58945 > 13.69.192.43.amqps: Flags [.], ack 12992, win 137, options [nop,nop,TS val 140420673 ecr 28506681], length 0

It seems the server (azure) quits querying the edge device. I don't see any unanswered messages from the azure server to the edge device.

rmr commented 6 years ago

This is specific to the RPI board. It seems other boards won't work with this example.