Azure / iotedge

The IoT Edge OSS project
MIT License

IoTEdge offline support #3533

Closed abhatikar closed 3 years ago

abhatikar commented 4 years ago

I have successfully set up a leaf device connected to an IoT Edge gateway, configured as a transparent gateway with offline support, following the guide.
Question 1: During testing, when I disconnect the IoT Edge device from the internet, the leaf device keeps sending messages, and I can see the size of the mounted storage directory increase. However, when connectivity is restored, the storage is not cleared. I need your input here.

Question 2: How can I verify that the stored messages were forwarded to IoT Hub once connectivity was restored? I could not see the old messages in the IoT Explorer telemetry view. I need your input here.

Thanks in advance.

iotedge check output (iotedge 1.0.9.4):

Configuration checks

√ config.yaml is well-formed - OK
√ config.yaml has well-formed connection string - OK
√ container engine is installed and functional - OK
√ config.yaml has correct hostname - OK
√ config.yaml has correct URIs for daemon mgmt endpoint - OK
√ latest security daemon - OK
√ host time is close to real time - OK
√ container time is close to host time - OK
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub. Please see https://aka.ms/iotedge-prod-checklist-dns for best practices. You can ignore this warning if you are setting DNS server per module in the Edge deployment.
        caused by: Could not open container engine config file /etc/docker/daemon.json
        caused by: No such file or directory (os error 2)
√ production readiness: certificates - OK
‼ production readiness: container engine - Warning
    Device is not using a production-supported container engine (moby-engine). Please see https://aka.ms/iotedge-prod-checklist-moby for details.
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space. Please see https://aka.ms/iotedge-prod-checklist-logs for best practices. You can ignore this warning if you are setting log policy per module in the Edge deployment.
        caused by: Could not open container engine config file /etc/docker/daemon.json
        caused by: No such file or directory (os error 2)
√ production readiness: Edge Agent's storage directory is persisted on the host filesystem - OK
√ production readiness: Edge Hub's storage directory is persisted on the host filesystem - OK

Connectivity checks

√ host can connect to and perform TLS handshake with IoT Hub AMQP port - OK
√ host can connect to and perform TLS handshake with IoT Hub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with IoT Hub MQTT port - OK
√ container on the default network can connect to IoT Hub AMQP port - OK
√ container on the default network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the default network can connect to IoT Hub MQTT port - OK
√ container on the IoT Edge module network can connect to IoT Hub AMQP port - OK
√ container on the IoT Edge module network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the IoT Edge module network can connect to IoT Hub MQTT port - OK

20 check(s) succeeded. 3 check(s) raised warnings.

philipktlin commented 4 years ago

Ans1: It takes time to send all stored messages upstream. Behind the scenes, a message cleanup task runs every 30 minutes and DB compaction runs every 2 hours. Neither of these settings is configurable.

Ans2: You should be able to see all the messages, provided they have not expired under the TTL setting. By default, the TTL is 2 hours. Please try going offline for a shorter period (e.g. 5 minutes) and check whether all messages are delivered to IoT Hub.
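One way to check delivery, sketched here under assumptions that are not part of the thread: the leaf device stamps every message with an increasing counter (a hypothetical `seq` field), and the counters seen on the IoT Hub side are collected by whatever consumer you already use. The two helpers below then report which messages never arrived and which were older than the TTL at reconnect, so a gap can be told apart from an expected expiry. The exact expiry behaviour of edgeHub is approximated as "age greater than TTL".

```python
from typing import Dict, Iterable, List


def missing_sequence_numbers(sent: Iterable[int], received: Iterable[int]) -> List[int]:
    """Return the sequence numbers that were sent but never seen upstream."""
    return sorted(set(sent) - set(received))


def expired_before_reconnect(
    send_times: Dict[int, float], reconnect_time: float, ttl_secs: float
) -> List[int]:
    """Return sequence numbers whose age at reconnect was greater than the TTL.

    send_times maps the hypothetical sequence number to the send time in seconds.
    """
    return sorted(
        seq for seq, sent_at in send_times.items() if reconnect_time - sent_at > ttl_secs
    )


# Illustrative values only: five messages sent, three seen on the hub side.
sent = range(1, 6)
received = [1, 2, 5]
print("missing:", missing_sequence_numbers(sent, received))
```

If every missing number also appears in the expired list, the loss is consistent with the TTL rather than with a forwarding problem.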

abhatikar commented 4 years ago

Regarding Ans1, I left the storage mount untouched, but its size has not changed after 10 hours. I will try a shorter TTL and let you know.

Thanks for your response.

cheers,

philipktlin commented 4 years ago

Did you turn on log rotation? https://docs.microsoft.com/en-us/azure/iot-edge/production-checklist#place-limits-on-log-size
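For readers following along, a minimal sketch of what a log rotation setting for the container engine can look like. It builds the JSON that would go in /etc/docker/daemon.json (the file named in the check output above) using Docker's `json-file` logging driver options. The size and file-count values are illustrative assumptions, not recommendations taken from this thread.

```python
import json


def build_daemon_log_config(max_size: str = "10m", max_file: str = "3") -> dict:
    """Build a daemon.json fragment that caps container log size and count.

    The default values are illustrative; pick limits that suit the device.
    """
    return {
        "log-driver": "json-file",
        "log-opts": {"max-size": max_size, "max-file": max_file},
    }


# Print the fragment so it can be reviewed before being written to disk.
print(json.dumps(build_daemon_log_config(), indent=2))
```

The container engine normally has to be restarted before a change to this file takes effect.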

abhatikar commented 4 years ago

Yes,

Configuration checks

√ config.yaml is well-formed - OK
√ config.yaml has well-formed connection string - OK
√ container engine is installed and functional - OK
√ config.yaml has correct hostname - OK
√ config.yaml has correct URIs for daemon mgmt endpoint - OK
‼ latest security daemon - Warning
    Installed IoT Edge daemon has version 1.0.9.4 but 1.0.9.5 is the latest stable version available. Please see https://aka.ms/iotedge-update-runtime for update instructions.
√ host time is close to real time - OK
√ container time is close to host time - OK
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub. Please see https://aka.ms/iotedge-prod-checklist-dns for best practices. You can ignore this warning if you are setting DNS server per module in the Edge deployment.
√ production readiness: certificates - OK
‼ production readiness: container engine - Warning
    Device is not using a production-supported container engine (moby-engine). Please see https://aka.ms/iotedge-prod-checklist-moby for details.
√ production readiness: logs policy - OK
√ production readiness: Edge Agent's storage directory is persisted on the host filesystem - OK
√ production readiness: Edge Hub's storage directory is persisted on the host filesystem - OK

Connectivity checks

√ host can connect to and perform TLS handshake with IoT Hub AMQP port - OK
√ host can connect to and perform TLS handshake with IoT Hub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with IoT Hub MQTT port - OK
√ container on the default network can connect to IoT Hub AMQP port - OK
√ container on the default network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the default network can connect to IoT Hub MQTT port - OK
√ container on the IoT Edge module network can connect to IoT Hub AMQP port - OK
√ container on the IoT Edge module network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the IoT Edge module network can connect to IoT Hub MQTT port - OK

20 check(s) succeeded. 3 check(s) raised warnings. Re-run with --verbose for more details.

abhatikar commented 4 years ago

It's behaving a little strangely: I cannot see the size either increasing or decreasing. BTW, the TTL is 300 now.
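For reference, a sketch of how a 300-second TTL is typically expressed in the edgeHub module twin's desired properties, via `storeAndForwardConfiguration.timeToLiveSecs`. The route name `upstream` and the schema version shown are assumptions for illustration; the actual deployment used in this thread was not posted.

```python
import json


def edge_hub_desired_properties(ttl_secs: int) -> dict:
    """Build an illustrative "properties.desired" section for the $edgeHub module."""
    if ttl_secs <= 0:
        raise ValueError("ttl_secs must be a positive number of seconds")
    return {
        "properties.desired": {
            "schemaVersion": "1.0",  # assumed schema version
            # Hypothetical route name; forwards all messages to IoT Hub.
            "routes": {"upstream": "FROM /messages/* INTO $upstream"},
            "storeAndForwardConfiguration": {"timeToLiveSecs": ttl_secs},
        }
    }


# TTL of 300 seconds, matching the value mentioned in the comment above.
print(json.dumps(edge_hub_desired_properties(300), indent=2))
```

With this value, messages held for longer than five minutes while offline would be expected to expire rather than be forwarded.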

philipktlin commented 4 years ago

I am trying to reproduce locally and will get back to you ASAP.

philipktlin commented 4 years ago

I ran iotedged 1.0.9.4 with edgeAgent and edgeHub 1.0 (1.0.9.4) and defined the storageFolder environment variable and bind mounts, so that the edgeHub data files (RocksDB SST and log files) are written to the host storage folder. I then sent many messages while offline and, after more than an hour, reconnected to the internet and stopped sending messages. I expected that once all messages had been sent upstream, the size of the SST files in storage/EdgeHub would drop back to roughly 75M, the size when I first started IoT Edge. I also set the RocksDB_MaxTotalWalSize environment variable to cap the RocksDB log at 100MB. So after the test I expected the edgeHub folder to be at or below about 175MB. However, it remains at 248M.
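The arithmetic above can be reproduced with a small script. This is only a sketch: the folder path is a placeholder for wherever the edgeHub storage is bind-mounted on the host, and the M and MB figures quoted above are treated as the same unit for the comparison.

```python
import os


def directory_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # A file can disappear between listing and stat, e.g. during compaction.
                pass
    return total


def excess_mb(observed_mb: float, baseline_mb: float, wal_limit_mb: float) -> float:
    """Return how far the observed size is above baseline plus the WAL cap."""
    return observed_mb - (baseline_mb + wal_limit_mb)


# Figures from the comment above: 75 baseline, 100 WAL cap, 248 observed.
print("expected upper bound (MB):", 75 + 100)
print("excess over expectation (MB):", excess_mb(248, 75, 100))
```

Calling `directory_size_bytes` on the storage folder at intervals, before and after reconnecting, gives a simple record of whether cleanup and compaction ever shrink it.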

@varunpuranik Do you have any idea?

philipktlin commented 4 years ago

I posted a question to the RocksDB forum about data file cleanup and compaction: https://groups.google.com/g/rocksdb/c/8iFxgtWuHNE.

abhatikar commented 4 years ago

Hello Philip,

Just following up on the issue.. Any leads there?

philipktlin commented 4 years ago

I opened an issue in the RocksDB GitHub repo: https://github.com/facebook/rocksdb/issues/7512.

github-actions[bot] commented 4 years ago

This issue is being marked as stale because it has been open for 30 days with no activity.

veyalla commented 3 years ago

Please check with the latest supported release and open a new issue if the problem persists.