brocaar / chirpstack-network-server

ChirpStack Network Server is an open-source LoRaWAN network-server.
https://www.chirpstack.io
MIT License
1.49k stars 546 forks source link

MQTT connection issues #5

Closed brocaar closed 8 years ago

brocaar commented 8 years ago

After running loraserver for a long time (I'm using the latest https://github.com/eclipse/paho.mqtt.golang version: 4ab3e86, but the problem is not exclusive to this version), it runs into mqtt connection issues. Most of the time, it is able to recover by (automatically) reconnecting and re-subscribing, but sometimes it looses the connection with the mqtt server (Mosquitto 1.4.3 in my case) and stops handling packets. Server is a Digitalocean Ubuntu 15.10 server. I see the same behaviour for the semtech bridge.

One part where the mqtt connection happens is: https://github.com/brocaar/loraserver/blob/master/application/mqttpubsub/backend.go

This problem might be related to https://github.com/eclipse/paho.mqtt.golang/issues/32

Please share your findings!

loraserver logs:

Apr 01 18:10:08 iot loraserver[18659]: time="2016-04-01T18:10:08-04:00" level=info msg="application/mqttpubsub: publishing message" topic="application/0101010101010101/node/0202020202020202/rx"
Apr 01 18:10:08 iot loraserver[18659]: time="2016-04-01T18:10:08-04:00" level=info msg="node-session saved" dev_addr=07e4ac98
Apr 01 18:10:58 iot loraserver[18659]: time="2016-04-01T18:10:58-04:00" level=error msg="application/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 18:10:58 iot loraserver[18659]: time="2016-04-01T18:10:58-04:00" level=info msg="application/mqttpubsub: connected to mqtt server"
Apr 01 18:10:58 iot loraserver[18659]: time="2016-04-01T18:10:58-04:00" level=info msg="application/mqttpubsub: subscribing to tx topic" topic="application/+/node/+/tx"
Apr 01 18:11:08 iot loraserver[18659]: time="2016-04-01T18:11:08-04:00" level=error msg="application/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 18:11:08 iot loraserver[18659]: time="2016-04-01T18:11:08-04:00" level=info msg="application/mqttpubsub: connected to mqtt server"
Apr 01 18:11:08 iot loraserver[18659]: time="2016-04-01T18:11:08-04:00" level=info msg="application/mqttpubsub: subscribing to tx topic" topic="application/+/node/+/tx"
Apr 01 18:11:10 iot loraserver[18659]: time="2016-04-01T18:11:10-04:00" level=info msg="packet(s) collected" gw_count=1 gw_macs=1dee08d0b691d149 mtype=UnconfirmedDataUp
Apr 01 18:11:10 iot loraserver[18659]: time="2016-04-01T18:11:10-04:00" level=info msg="application/mqttpubsub: publishing message" topic="application/0101010101010101/node/0202020202020202/rx"
Apr 01 18:11:10 iot loraserver[18659]: time="2016-04-01T18:11:10-04:00" level=info msg="node-session saved" dev_addr=07e4ac98
Apr 01 18:12:12 iot loraserver[18659]: time="2016-04-01T18:12:12-04:00" level=info msg="packet(s) collected" gw_count=1 gw_macs=1dee08d0b691d149 mtype=UnconfirmedDataUp
Apr 01 18:12:12 iot loraserver[18659]: time="2016-04-01T18:12:12-04:00" level=info msg="application/mqttpubsub: publishing message" topic="application/0101010101010101/node/0202020202020202/rx"
Apr 01 18:12:12 iot loraserver[18659]: time="2016-04-01T18:12:12-04:00" level=info msg="node-session saved" dev_addr=07e4ac98
Apr 01 18:13:15 iot loraserver[18659]: time="2016-04-01T18:13:15-04:00" level=info msg="packet(s) collected" gw_count=1 gw_macs=1dee08d0b691d149 mtype=UnconfirmedDataUp
Apr 01 18:13:15 iot loraserver[18659]: time="2016-04-01T18:13:15-04:00" level=info msg="application/mqttpubsub: publishing message" topic="application/0101010101010101/node/0202020202020202/rx"
Apr 01 18:13:15 iot loraserver[18659]: time="2016-04-01T18:13:15-04:00" level=info msg="node-session saved" dev_addr=07e4ac98
Apr 01 18:14:17 iot loraserver[18659]: time="2016-04-01T18:14:17-04:00" level=info msg="packet(s) collected" gw_count=1 gw_macs=1dee08d0b691d149 mtype=UnconfirmedDataUp
Apr 01 18:14:17 iot loraserver[18659]: time="2016-04-01T18:14:17-04:00" level=info msg="application/mqttpubsub: publishing message" topic="application/0101010101010101/node/0202020202020202/rx"
Apr 01 18:14:17 iot loraserver[18659]: time="2016-04-01T18:14:17-04:00" level=info msg="node-session saved" dev_addr=07e4ac98
Apr 01 18:21:27 iot loraserver[18659]: time="2016-04-01T18:21:27-04:00" level=error msg="application/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 18:21:27 iot loraserver[18659]: time="2016-04-01T18:21:27-04:00" level=info msg="application/mqttpubsub: connected to mqtt server"
Apr 01 18:21:27 iot loraserver[18659]: time="2016-04-01T18:21:27-04:00" level=info msg="application/mqttpubsub: subscribing to tx topic" topic="application/+/node/+/tx"
Apr 01 18:21:37 iot loraserver[18659]: time="2016-04-01T18:21:37-04:00" level=error msg="application/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 18:21:37 iot loraserver[18659]: time="2016-04-01T18:21:37-04:00" level=info msg="application/mqttpubsub: connected to mqtt server"
Apr 01 18:21:37 iot loraserver[18659]: time="2016-04-01T18:21:37-04:00" level=info msg="application/mqttpubsub: subscribing to tx topic" topic="application/+/node/+/tx"
Apr 01 18:50:04 iot loraserver[18659]: time="2016-04-01T18:50:04-04:00" level=error msg="gateway/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 18:50:04 iot loraserver[18659]: time="2016-04-01T18:50:04-04:00" level=info msg="gateway/mqttpubsub: connected to mqtt server"
Apr 01 18:50:04 iot loraserver[18659]: time="2016-04-01T18:50:04-04:00" level=info msg="gateway/mqttpubsub: subscribing to rx topic" topic="gateway/+/rx"
Apr 01 18:50:14 iot loraserver[18659]: time="2016-04-01T18:50:14-04:00" level=error msg="gateway/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 18:50:14 iot loraserver[18659]: time="2016-04-01T18:50:14-04:00" level=info msg="gateway/mqttpubsub: connected to mqtt server"
Apr 01 18:50:14 iot loraserver[18659]: time="2016-04-01T18:50:14-04:00" level=info msg="gateway/mqttpubsub: subscribing to rx topic" topic="gateway/+/rx"
Apr 01 19:16:44 iot loraserver[18659]: time="2016-04-01T19:16:44-04:00" level=error msg="gateway/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 19:16:44 iot loraserver[18659]: time="2016-04-01T19:16:44-04:00" level=info msg="gateway/mqttpubsub: connected to mqtt server"
Apr 01 19:16:44 iot loraserver[18659]: time="2016-04-01T19:16:44-04:00" level=info msg="gateway/mqttpubsub: subscribing to rx topic" topic="gateway/+/rx"
Apr 01 19:16:54 iot loraserver[18659]: time="2016-04-01T19:16:54-04:00" level=error msg="gateway/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 19:16:54 iot loraserver[18659]: time="2016-04-01T19:16:54-04:00" level=info msg="gateway/mqttpubsub: connected to mqtt server"
Apr 01 19:16:54 iot loraserver[18659]: time="2016-04-01T19:16:54-04:00" level=info msg="gateway/mqttpubsub: subscribing to rx topic" topic="gateway/+/rx"
Apr 01 20:04:24 iot loraserver[18659]: time="2016-04-01T20:04:24-04:00" level=error msg="gateway/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 20:04:24 iot loraserver[18659]: time="2016-04-01T20:04:24-04:00" level=info msg="gateway/mqttpubsub: connected to mqtt server"
Apr 01 20:04:24 iot loraserver[18659]: time="2016-04-01T20:04:24-04:00" level=info msg="gateway/mqttpubsub: subscribing to rx topic" topic="gateway/+/rx"
Apr 01 20:04:34 iot loraserver[18659]: time="2016-04-01T20:04:34-04:00" level=error msg="gateway/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 20:04:34 iot loraserver[18659]: time="2016-04-01T20:04:34-04:00" level=info msg="gateway/mqttpubsub: connected to mqtt server"
Apr 01 20:04:34 iot loraserver[18659]: time="2016-04-01T20:04:34-04:00" level=info msg="gateway/mqttpubsub: subscribing to rx topic" topic="gateway/+/rx"
Apr 01 20:34:48 iot loraserver[18659]: time="2016-04-01T20:34:48-04:00" level=error msg="application/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 20:34:48 iot loraserver[18659]: time="2016-04-01T20:34:48-04:00" level=info msg="application/mqttpubsub: connected to mqtt server"
Apr 01 20:34:48 iot loraserver[18659]: time="2016-04-01T20:34:48-04:00" level=info msg="application/mqttpubsub: subscribing to tx topic" topic="application/+/node/+/tx"
Apr 01 20:34:58 iot loraserver[18659]: time="2016-04-01T20:34:58-04:00" level=error msg="application/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 20:34:58 iot loraserver[18659]: time="2016-04-01T20:34:58-04:00" level=info msg="application/mqttpubsub: connected to mqtt server"
Apr 01 20:34:58 iot loraserver[18659]: time="2016-04-01T20:34:58-04:00" level=info msg="application/mqttpubsub: subscribing to tx topic" topic="application/+/node/+/tx"
Apr 01 20:59:08 iot loraserver[18659]: time="2016-04-01T20:59:08-04:00" level=error msg="application/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 20:59:08 iot loraserver[18659]: time="2016-04-01T20:59:08-04:00" level=info msg="application/mqttpubsub: connected to mqtt server"
Apr 01 20:59:08 iot loraserver[18659]: time="2016-04-01T20:59:08-04:00" level=info msg="application/mqttpubsub: subscribing to tx topic" topic="application/+/node/+/tx"
Apr 01 20:59:18 iot loraserver[18659]: time="2016-04-01T20:59:18-04:00" level=error msg="application/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
Apr 01 20:59:18 iot loraserver[18659]: time="2016-04-01T20:59:18-04:00" level=info msg="application/mqttpubsub: connected to mqtt server"
Apr 01 20:59:18 iot loraserver[18659]: time="2016-04-01T20:59:18-04:00" level=info msg="application/mqttpubsub: subscribing to tx topic" topic="application/+/node/+/tx"
Apr 01 21:13:18 iot loraserver[18659]: time="2016-04-01T21:13:18-04:00" level=error msg="application/mqttpubsub: mqtt connection error: pingresp not received, disconnecting"
...

Mosquitto logs:

1459546993: Socket error on client 885a5fc6-12cb-4249-a7c4-9cb4aaf99d10, disconnecting.
1459546993: New client connected from 127.0.0.1 as c371d40c-876d-4cf1-a900-24539741f56b (c1, k30).
1459547003: Socket error on client c371d40c-876d-4cf1-a900-24539741f56b, disconnecting.
1459547003: New connection from 127.0.0.1 on port 1883.
1459547003: New client connected from 127.0.0.1 as 43e07938-154a-4571-9e14-bb4443cf5138 (c1, k30).
1459547103: New connection from 127.0.0.1 on port 1883.
1459547103: Socket error on client 43e07938-154a-4571-9e14-bb4443cf5138, disconnecting.
1459547103: New client connected from 127.0.0.1 as 518e58fe-f2e5-49ce-b621-cae192e184cc (c1, k30).
1459547113: Socket error on client 518e58fe-f2e5-49ce-b621-cae192e184cc, disconnecting.
1459547113: New connection from 127.0.0.1 on port 1883.
1459547113: New client connected from 127.0.0.1 as 2e11c696-b3c2-436e-82d0-4e20952f5acf (c1, k30).
1459547963: Saving in-memory database to /var/lib/mosquitto/mosquitto.db.
1459548105: Socket error on client 534e8be8-8dbb-4c05-b29d-ae8d13b9a979, disconnecting.
1459548105: New connection from 127.0.0.1 on port 1883.
1459548105: New client connected from 127.0.0.1 as 8664a25e-22e5-4417-89ef-b862e33e2c90 (c1, k30).
1459548115: Socket error on client 8664a25e-22e5-4417-89ef-b862e33e2c90, disconnecting.
1459548115: New connection from 127.0.0.1 on port 1883.
1459548115: New client connected from 127.0.0.1 as 398a4f2e-4b18-4148-a322-176249184796 (c1, k30).
1459548479: New connection from 127.0.0.1 on port 1883.
1459548479: Socket error on client 12220090-cb15-4ba5-9636-b98dc04a1796, disconnecting.
1459548479: New client connected from 127.0.0.1 as 16c9079e-6ef7-4097-81d3-66f517c48ff6 (c1, k30).
1459548489: Socket error on client 16c9079e-6ef7-4097-81d3-66f517c48ff6, disconnecting.
1459548489: New connection from 127.0.0.1 on port 1883.
1459548489: New client connected from 127.0.0.1 as fcc78d39-2fd3-4a17-980a-73d8b290989f (c1, k30).
1459548658: Socket error on client 398a4f2e-4b18-4148-a322-176249184796, disconnecting.
1459548658: New connection from 127.0.0.1 on port 1883.
1459548658: New client connected from 127.0.0.1 as 1a30a813-adaa-4fb6-8581-3be188c482f5 (c1, k30).
1459548668: Socket error on client 1a30a813-adaa-4fb6-8581-3be188c482f5, disconnecting.
1459548668: New connection from 127.0.0.1 on port 1883.
1459548668: New client connected from 127.0.0.1 as 73f2fd18-c10d-4b09-a9e1-c99d293f8e34 (c1, k30).
1459549287: Socket error on client 73f2fd18-c10d-4b09-a9e1-c99d293f8e34, disconnecting.
1459549287: New connection from 127.0.0.1 on port 1883.
1459549287: New client connected from 127.0.0.1 as f7a011f3-a9a0-4e64-98ad-1850233ddbaf (c1, k30).
1459549297: Socket error on client f7a011f3-a9a0-4e64-98ad-1850233ddbaf, disconnecting.
1459549297: New connection from 127.0.0.1 on port 1883.
1459549297: New client connected from 127.0.0.1 as 314ef86a-9fdb-4ce1-92a6-6d2844a38567 (c1, k30).
...
brocaar commented 8 years ago

This issue is hopefully fixed in the updated mqtt package.

brocaar commented 8 years ago

There are still connection problems in the paho package, but both the latest lora-semtech-bridge and loraserver are able to recover from connection issues. I'll close this issue for now and will keep an eye on the paho mqtt client updates.