Azure / iotedge

The IoT Edge OSS project
MIT License

EFLOW messages do not arrive in the hub #6981

Closed nachtiro closed 11 months ago

nachtiro commented 1 year ago

I am currently testing EFLOW, which I have set up as a transparent gateway. The downstream device is simulated by the C# example. So far everything works, apart from one small problem: the first message in the packet often arrives twice. That is not critical for now.

My problem occurs when EFLOW loses its internet connection. In this case EFLOW should cache the messages from the downstream device and send them to the hub once the connection is restored.

Steps to reproduce:
1. Disconnect EFLOW from the internet.
2. Run the C# example to send 10 messages from a downstream device to EFLOW.
3. Reconnect EFLOW to the internet.

Of the 10 messages, either only the first or none arrive in the hub.
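One way to make this observation precise is to compare the IDs of the messages that were sent with the IDs that show up in the hub. The following is only a minimal sketch: it assumes each message carries a sequence number (the C# example may not add one by default), and `summarize_delivery` is a hypothetical helper, not part of any Azure SDK.

```python
from collections import Counter


def summarize_delivery(sent_ids, received_ids):
    """Compare IDs sent by the downstream device with IDs observed in the hub.

    Returns the IDs that never arrived and the IDs that arrived more than once.
    """
    counts = Counter(received_ids)
    missing = [i for i in sent_ids if counts[i] == 0]
    duplicated = [i for i in sent_ids if counts[i] > 1]
    return {"missing": missing, "duplicated": duplicated}


# Hypothetical data: 10 messages sent while offline, only the first one arrived.
sent = list(range(1, 11))
received = [1]
report = summarize_delivery(sent, received)
print(report)
```

The same helper also flags the duplicate first message mentioned earlier, for example when the received list is `[1, 1, 2, 3]`.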

Windows Host OS:
edition: Windows 10 Enterprise
version: 21H2
build: 19044.2604

EFLOW:
VmConfiguration : @{ID=1358c4bedee6038; name=C-0620-EFLOW; properties=; tags=}
VmPowerState : Running
EdgeRuntimeVersion : @{IotEdgeVersion=1.4.9; MobyEngineVersion=20.10.14; MobyCliVersion=20.10.12}
EdgeRuntimeStatus : @{SystemCtlStatus=System.Object[]; ModuleList=System.Object[]}
SystemStatistics : @{TotalMemMb=3925; UsedMemMb=554; AvailableMemMb=3154; TotalStorageMb=15207; UsedStorageMb=425; AvailableStorageMb=14145; CpuCount=2; KernelVersion=5.15.92.1-2.cm2 #1 SMP Sat Feb 18 03:26:09 UTC 2023}

I have attached an excerpt of the edgeHub log from a failed attempt: edgeHub_log.txt

fcabrera23 commented 1 year ago

Hi @jlian,

Could you please take a look? Do you think this is something EFLOW-specific?

Thanks, Francisco

varunpuranik commented 1 year ago

Looking at the logs, it seems that EdgeHub is not able to reach IoT Hub, which typically points to a network issue. Can you run the "iotedge check" command in a repro scenario and post the output here? See https://learn.microsoft.com/en-us/azure/iot-edge/troubleshoot?view=iotedge-1.4#run-the-check-command for details. Please make sure to remove any sensitive information from the output before posting it.

nachtiro commented 1 year ago

Thanks for the reply.

Once EFLOW is reconnected to the internet it works as usual, but the messages sent while offline are not uploaded. When I run the check command after the connection is restored, it reports only three warnings about production readiness and one about the DNS server. However, the command "resolvectl | grep eth0 -A 8" lists DNS servers, and addresses can be resolved without problems. Attachments: Check.txt, DNS.txt

I have also tested with two different, independent internet connections and had the same problem with both.

varunpuranik commented 1 year ago

EdgeHub is designed to store all messages locally when offline, and route them to IoT Hub when the device is back online. But depending on the value of the TTL, the messages could be deleted if the device is offline for longer than the TTL period. https://learn.microsoft.com/en-us/azure/iot-edge/offline-capabilities?view=iotedge-1.4#time-to-live
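To illustrate the TTL behavior described above, here is a simplified model of which stored messages would still be eligible for upload after a reconnect. It is only a sketch of the documented rule, not EdgeHub's actual implementation; it assumes the documented default of 7200 seconds for `timeToLiveSecs`, and the helper names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta

# Documented default for storeAndForwardConfiguration.timeToLiveSecs (2 hours).
DEFAULT_TTL_SECS = 7200


def is_expired(enqueued_at, now, ttl_secs=DEFAULT_TTL_SECS):
    """Return True if a message has been stored for longer than the TTL."""
    return (now - enqueued_at) > timedelta(seconds=ttl_secs)


def deliverable_after_reconnect(messages, reconnect_time, ttl_secs=DEFAULT_TTL_SECS):
    """Return the IDs of messages that have not expired at reconnect time.

    messages is a list of (message_id, enqueued_at) tuples.
    """
    return [mid for mid, t in messages if not is_expired(t, reconnect_time, ttl_secs)]


# Hypothetical scenario: 10 messages enqueued one second apart while offline.
start = datetime(2023, 4, 1, 12, 0, 0)
messages = [(i, start + timedelta(seconds=i)) for i in range(1, 11)]

short_outage = deliverable_after_reconnect(messages, start + timedelta(minutes=10))
long_outage = deliverable_after_reconnect(messages, start + timedelta(hours=3))
print(len(short_outage), len(long_outage))  # prints "10 0"
```

Under this model, an outage of a few minutes with the default TTL should not cause any message to be dropped, so TTL expiry alone would not explain losses in a short test.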

So, my questions would be:

If you have a consistent repro, it would be great if you could enable debug logs and provide a support bundle by opening a support ticket.

nachtiro commented 1 year ago

Hey,

I left the TTL at its default value (2 hours), and the offline period during the tests was never that long.

I use Azure IoT Explorer (preview) to watch the messages arriving in the hub, and the metrics in the Azure portal also show no incoming messages in the failing cases.

I have created the support bundle. Should I submit the support request in the Azure portal? There I am told that a subscription is required for technical support, which I do not have for my test scenario.

Further info: I have now allocated more host resources to EFLOW (2 CPU cores and 4 GB of RAM) and set AMQPWS as the default upstream protocol. The problem now occurs much less often than before, although I don't know whether this is directly related.
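For readers who want to try the same protocol change, the sketch below shows one possible way to set it: adding the `UpstreamProtocol` environment variable to a module's settings in a deployment manifest. How the protocol was actually configured in this test is not stated, so this is an assumption; `with_upstream_protocol` is a hypothetical helper and the module settings are a reduced placeholder.

```python
import json

# Values accepted by the UpstreamProtocol setting in IoT Edge.
ALLOWED_PROTOCOLS = {"Amqp", "AmqpWs", "Mqtt", "MqttWs"}


def with_upstream_protocol(module_settings, protocol="AmqpWs"):
    """Return a copy of the module settings with UpstreamProtocol set in env."""
    if protocol not in ALLOWED_PROTOCOLS:
        raise ValueError(f"unsupported protocol: {protocol}")
    updated = dict(module_settings)
    env = dict(updated.get("env", {}))
    env["UpstreamProtocol"] = {"value": protocol}
    updated["env"] = env
    return updated


# Placeholder edgeHub entry; a real manifest contains more fields.
edge_hub = {"type": "docker", "env": {}}
patched = with_upstream_protocol(edge_hub)
print(json.dumps(patched, indent=2))
```

The helper copies the settings instead of mutating them, so the original manifest fragment can be kept for comparison.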

github-actions[bot] commented 1 year ago

This issue is being marked as stale because it has been open for 30 days with no activity.

gordonwang0 commented 11 months ago

Closing stale issue. Please let us know if you still need help with this.