thin-edge / thin-edge.io

The open edge framework for lightweight IoT devices
https://thin-edge.io
Apache License 2.0
223 stars 55 forks source link

te Bridge Health Topic not working #2485

Closed wilbersl closed 1 year ago

wilbersl commented 1 year ago

Describe the bug te/device/main/service/mosquitto-c8y-bridge/status/health Doesn't go to 0 when internet connection is cut. After Reconnecting to the internet it Publishes a 1 sometimes.

To Reproduce

Expected behavior 0 is Published on the Topic in a sensible amount of Time (1 - 5 minutes for me).

Screenshots

Environment (please complete the following information):

Additional context

reubenmiller commented 1 year ago

@wilbersl Generally MQTT relies on the PING message to determine whether the connection is active or not, so when you unplug the network cable, then it can take up to 2 x the PING interval (configured in the mosquitto).

Running the same procedure the bridge health status was sent after about 2 mins (as the default MQTT PING interval is 60 seconds, see the keepalive_interval in mosquitto config.

tedge mqtt sub '#'

Output (showing the mosquitto-c8y-bridge status)

[te/device/main/service/mosquitto-c8y-bridge/status/health] 1

# ...up to 2 mins later

[te/device/main/service/mosquitto-c8y-bridge/status/health] 0

Can you check if the message is eventually sent?

And if you want to make sure you don't lose any messages between unplugging the network cable and until mosquitto recognizes that the network is unavailable, then you definitely need to make sure you publish with QoS >= 1, as QoS 0 messages will be dropped in this period.

wilbersl commented 1 year ago

My Output [te/device/main/service/mosquitto-c8y-bridge/status/health] 1 #wait 5 mins => nothing #replug cable [te/device/main/service/mosquitto-c8y-bridge/status/health] 1

Do I need to configure the keepalive_interval myself? I couldn't find it in /etc/tedge/mosquitto-conf/c8y-bridge.conf.

reubenmiller commented 1 year ago

Can you print out your c8y-bridge config…no you don’t need to set the keep alive interval as the default should be suitable

wilbersl commented 1 year ago

c8y-bridge.conf.txt

reubenmiller commented 1 year ago

The mosquitto logs would also be useful here (you might need to enable this if you haven’t already). But the bridge config looks ok.

also details about which mosquitto version is being used.

wilbersl commented 1 year ago

I used a fresh install of RevPi bullseye lite, only installed ThinEdge via curl -1sLf 'https://dl.cloudsmith.io/public/thinedge/tedge-main/setup.deb.sh' | sudo bash sudo apt-get update sudo apt-get install tedge-full

mosquitto version 2.0.11

mosquitto.log

reubenmiller commented 1 year ago

@wilbersl The attached mosquitto log shows that the mosquitto service was restarted (possible a device restart?).

Does the log include the time period where the ethernet cable was disconnected, if not can you please do the following, the post the mosquitto logs:

  1. Check that the c8y bridge connection is healthy
  2. Disconnect the cable (verify that there is no other wifi connection)
  3. Wait 10 mins
  4. Connect the cable again and wait 5 mins
  5. Check if the c8y bridge connection is healthy
  6. Send the mosquitto logs from the above procedure (highlighting in the log at what timestamp the cable was unplugged/plugged in would be very helpful)
wilbersl commented 1 year ago

Here is the Log with some comments!

#Start ThinEdge
1701434948: mosquitto version 2.0.11 starting
1701434948: Config loaded from /etc/mosquitto/mosquitto.conf.
1701434948: Opening ipv4 listen socket on port 1883.
1701434948: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701434948: mosquitto version 2.0.11 running
1701434948: Connecting bridge (step 2) edge_to_c8y (mqtt.cumulocity.com:8883)
1701434949: New connection from 127.0.0.1:41162 on port 1883.
1701434949: New connection from 127.0.0.1:41170 on port 1883.
1701434949: New client connected from 127.0.0.1:41162 as c8y-firmware-plugin (p2, c0, k60).
1701434949: New client connected from 127.0.0.1:41170 as tedge-sub-9299 (p2, c1, k60).
1701434949: tedge-sub-9299 0 #
1701434949: New connection from 127.0.0.1:41184 on port 1883.
1701434949: New client connected from 127.0.0.1:41184 as tedge-log-plugin#te/device/main// (p2, c0, k60).
1701434949: New connection from 127.0.0.1:41196 on port 1883.
1701434949: New client connected from 127.0.0.1:41196 as tedge-configuration-plugin#te/device/main// (p2, c0, k60).
1701434955: New connection from 127.0.0.1:49838 on port 1883.
1701434955: New client connected from 127.0.0.1:49838 as check_connection_c8y (p2, c1, k10).
1701434955: check_connection_c8y 1 c8y/s/dat
1701434955: Client check_connection_c8y closed its connection.
1701434955: New connection from 127.0.0.1:49850 on port 1883.
1701434955: New connection from 127.0.0.1:49852 on port 1883.
1701434955: New client connected from 127.0.0.1:49850 as tedge-mapper-c8y (p2, c0, k60).
1701434955: New client connected from 127.0.0.1:49852 as hosyshggpg (p2, c1, k60).
1701434955: hosyshggpg 1 c8y/s/dat
1701434955: New connection from 127.0.0.1:49874 on port 1883.
1701434955: New client connected from 127.0.0.1:49874 as last_will_c8y_mapper (p2, c0, k60).
1701434955: tedge-mapper-c8y 1 te/device/main/service/tedge-mapper-c8y/cmd/health/check
1701434955: tedge-mapper-c8y 1 te/device/main///cmd/health/check
1701434955: tedge-mapper-c8y 1 c8y-internal/alarms/+/+/+/+/+/a/+
1701434955: tedge-mapper-c8y 1 c8y/s/ds
1701434955: tedge-mapper-c8y 1 c8y/s/ds
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/restart/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/restart
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/software_list/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/software_list
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/software_update/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/software_update
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/log_upload/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/log_upload
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/config_snapshot/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/config_snapshot
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/config_update/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/cmd/config_update
1701434955: tedge-mapper-c8y 1 te/+/+/+/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/twin/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/m/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/e/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/a/+
1701434955: tedge-mapper-c8y 1 te/+/+/+/+/status/health
1701434955: tedge-mapper-c8y 1 tedge/commands/res/control/restart
1701434955: tedge-mapper-c8y 1 tedge/commands/res/software/list
1701434955: tedge-mapper-c8y 1 tedge/commands/res/software/update
1701434955: tedge-mapper-c8y 1 te/device/main///cmd/restart/+
1701434955: tedge-mapper-c8y 1 te/device/main///cmd/software_list/+
1701434955: tedge-mapper-c8y 1 te/device/main///cmd/software_update/+
1701434955: Client hosyshggpg disconnected.
1701434956: New connection from 127.0.0.1:49886 on port 1883.
1701434956: New client connected from 127.0.0.1:49886 as get_jwt_token_c8y (p2, c1, k10).
1701434956: get_jwt_token_c8y 1 c8y/s/dat
1701434957: Client get_jwt_token_c8y closed its connection.
1701434957: New connection from 127.0.0.1:49894 on port 1883.
1701434957: New client connected from 127.0.0.1:49894 as tedge-agent#te/device/main// (p2, c0, k60).
1701434957: tedge-agent#te/device/main// 1 te/device/main///cmd/+/+
1701434957: tedge-agent#te/device/main// 1 te/device/main/service/tedge-agent/cmd/health/check
1701434957: tedge-agent#te/device/main// 1 te/device/main///cmd/health/check
1701434957: tedge-agent#te/device/main// 1 tedge/measurements
1701434957: tedge-agent#te/device/main// 1 tedge/measurements/+
1701434957: tedge-agent#te/device/main// 1 tedge/events/+
1701434957: tedge-agent#te/device/main// 1 tedge/events/+/+
1701434957: tedge-agent#te/device/main// 1 tedge/alarms/+/+
1701434957: tedge-agent#te/device/main// 1 tedge/alarms/+/+/+
1701435044: Client tedge-sub-9299 has exceeded timeout, disconnecting.
#Test Publish 1
1701435105: New connection from 127.0.0.1:38178 on port 1883.
1701435105: New client connected from 127.0.0.1:38178 as tedge-pub-26221 (p2, c1, k60).
1701435105: Client tedge-pub-26221 closed its connection.
1701435182: New connection from 127.0.0.1:41280 on port 1883.
1701435182: New client connected from 127.0.0.1:41280 as tedge-sub-26226 (p2, c1, k60).
1701435182: tedge-sub-26226 0 #
#Remove Cable
1701435291: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435297: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435307: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435313: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435328: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435341: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435348: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435354: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435363: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435380: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435403: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435431: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435448: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435468: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435499: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435530: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435561: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435584: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435590: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435598: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435608: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435632: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435639: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435650: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435661: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435685: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435704: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435735: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435762: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435793: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435817: Client tedge-sub-26226 closed its connection.
1701435822: New connection from 127.0.0.1:33658 on port 1883.
1701435822: New client connected from 127.0.0.1:33658 as tedge-sub-26318 (p2, c1, k60).
1701435822: tedge-sub-26318 0 #
1701435824: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435855: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
#Plug Cable
1701435870: Client tedge-sub-26318 closed its connection.
1701435874: New connection from 127.0.0.1:40750 on port 1883.
1701435874: New client connected from 127.0.0.1:40750 as tedge-sub-26380 (p2, c1, k60).
1701435874: tedge-sub-26380 0 #
1701435886: Connecting bridge (step 1) edge_to_c8y (mqtt.cumulocity.com:8883)
1701435886: Connecting bridge (step 2) edge_to_c8y (mqtt.cumulocity.com:8883)
#incoming message [te/device/main/service/mosquitto-c8y-bridge/status/health] 1
1701435970: Client tedge-sub-26380 closed its connection.
#Test Publish 2
1701436209: New connection from 127.0.0.1:35578 on port 1883.
1701436209: New client connected from 127.0.0.1:35578 as tedge-pub-26438 (p2, c1, k60).
1701436209: Client tedge-pub-26438 closed its connection.
#Stop ThinEdge
1701436280: mosquitto version 2.0.11 terminating
1701436280: Saving in-memory database to /var/lib/mosquitto//mosquitto.db.
1701436280: mosquitto version 2.0.11 starting
1701436280: Config loaded from /etc/mosquitto/mosquitto.conf.
1701436280: Opening ipv4 listen socket on port 1883.
1701436280: mosquitto version 2.0.11 running
1701436281: New connection from 127.0.0.1:50466 on port 1883.
1701436281: New connection from 127.0.0.1:50482 on port 1883.
1701436281: New client connected from 127.0.0.1:50466 as c8y-firmware-plugin (p2, c0, k60).
1701436281: New client connected from 127.0.0.1:50482 as tedge-configuration-plugin#te/device/main// (p2, c0, k60).
1701436281: New connection from 127.0.0.1:50488 on port 1883.
1701436281: New client connected from 127.0.0.1:50488 as tedge-log-plugin#te/device/main// (p2, c0, k60).
reubenmiller commented 1 year ago

@wilbersl Thanks for the marked up logs.

After doing some research into the mosquitto repo, the problem seems to be related to these mosquitto bug tickets, so related to mosquitto 2.0.11.

It's unclear on the exact mosquitto version that this is solved in, however I can confirm that using 2.0.18, the bridge topic publishes a 0 with ~2 x the keep alive interval when the cable is disconnected.

wilbersl commented 1 year ago

I can confirm that using mosquitto 2.0.18 resolved the issue. Thank you for your help @reubenmiller !