owntracks / android

OwnTracks Android App
http://owntracks.org
Eclipse Public License 1.0
1.38k stars 476 forks source link

MQTT Exception delivering message (32000) #1653

Closed LinusNil closed 4 months ago

LinusNil commented 8 months ago

Hi It happens repeatedly that the endpoint queue is growing, even though the "Endpoint state" says "Connected" and I have a really good Internet connection. This is not resolved by restarting the application. The only way I can get it working again is activating/deactivating a Wireguard VPN tunnel I am using. I guess the only reason this works is because it triggers/forces some action in Owntracks making it reconnect to the HiveMQ broker.

I do love the app and I am planning on using it for a very good purpose, so I am eager to find a solution to this issue.

Log

2024-03-20 20:29:25.804 E MessageProcessor: Message delivery failed. queueLength: 3, messageId: 1710962692651-a0f58f
2024-03-20 20:29:25.807 W MessageProcessor: Error sending message. Re-queueing
2024-03-20 20:29:25.814 I MessageProcessor: Waiting for 64 s before retrying
2024-03-20 20:29:53.372 D zzl: location result received: LocationResult(lastLocation=Location[fused 58.128417,13.317405 hAcc=899.999 et=+14d8h20m7s334ms alt=209.25627729008312 vAcc=17.081835])
2024-03-20 20:29:58.401 D CachingGeocoder: Geocode cache: hits=1290, misses=16729
2024-03-20 20:30:04.013 D zzl: location result received: LocationResult(lastLocation=Location[fused 58.124546,13.299241 hAcc=2799.999 et=+14d8h20m17s350ms alt=208.5728795273598 vAcc=17.19073])
2024-03-20 20:30:04.021 W DeviceGeocoder: Rate-limited, not querying until 2024-03-20T19:30:58.400Z
2024-03-20 20:30:04.022 D CachingGeocoder: Geocode cache: hits=1290, misses=16730
2024-03-20 20:30:27.429 D zzl: location availability LocationAvailability(locationAvailable=false)
2024-03-20 20:30:29.829 D MessageProcessorEndpointMqtt: Sending message Location id=1710962692651-a0f58f: (58.0781133,13.0192388). Thread: Thread[networkHandlerThread,5,main]
2024-03-20 20:30:29.831 D MessageProcessorEndpointMqtt: already connected to broker
2024-03-20 20:30:59.834 E MessageProcessorEndpointMqtt: MQTT Exception delivering message
 (32000)
    at org.eclipse.paho.client.mqttv3.internal.Token.waitForCompletion(SourceFile:225)
    at org.owntracks.android.services.MessageProcessorEndpointMqtt.sendMessage(SourceFile:74)
    at org.owntracks.android.services.MessageProcessor.lambda$sendAvailableMessages$0(Unknown Source:2)
    at org.owntracks.android.services.MessageProcessor.$r8$lambda$bVYaIhYhXtJCzBaSxWOGtcUu6f8(Unknown Source:0)
    at org.owntracks.android.services.MessageProcessor$$ExternalSyntheticLambda1.call(Unknown Source:4)
    at java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at android.os.Handler.handleCallback(Handler.java:942)
    at android.os.Handler.dispatchMessage(Handler.java:99)
    at android.os.Looper.loopOnce(Looper.java:226)
    at android.os.Looper.loop(Looper.java:313)
    at android.os.HandlerThread.run(HandlerThread.java:67)

2024-03-20 20:30:59.835 E MessageProcessor: Message delivery failed. queueLength: 3, messageId: 1710962692651-a0f58f
2024-03-20 20:30:59.836 W MessageProcessor: Error sending message. Re-queueing
2024-03-20 20:30:59.844 I MessageProcessor: Waiting for 120 s before retrying
2024-03-20 20:31:02.035 D zzl: location availability LocationAvailability(locationAvailable=true)
2024-03-20 20:31:02.055 D zzl: location result received: LocationResult(lastLocation=Location[fused 58.140051,13.347753 hAcc=1299.999 et=+14d8h21m17s499ms alt=216.31714573237832 vAcc=17.777134])
2024-03-20 20:31:02.479 D DeviceGeocoder: Resolved 58.1401,13.3478 to Sörbyvägen 44, 521 52 Floby, Sverige
2024-03-20 20:31:02.481 D CachingGeocoder: Geocode cache: hits=1290, misses=16731
2024-03-20 20:31:13.674 D zzl: location result received: LocationResult(lastLocation=Location[fused 58.143514,13.367141 hAcc=56.1 et=+14d8h21m28s469ms alt=219.08467536939006 vAcc=17.892237])
2024-03-20 20:31:14.072 D DeviceGeocoder: Resolved 58.1435,13.3671 to Sörby 1, 521 56 Floby, Sverige
2024-03-20 20:31:14.073 D CachingGeocoder: Geocode cache: hits=1290, misses=16732
2024-03-20 20:31:24.526 D zzl: location result received: LocationResult(lastLocation=Location[fused 58.144655,13.371882 hAcc=110.0 et=+14d8h21m38s149ms alt=222.00452015452788 vAcc=11.538617])
2024-03-20 20:31:25.673 D DeviceGeocoder: Resolved 58.1447,13.3719 to Sörby 1, 521 56 Floby, Sverige
2024-03-20 20:31:25.675 D CachingGeocoder: Geocode cache: hits=1290, misses=16733
2024-03-20 20:32:44.309 D zzl: location result received: LocationResult(lastLocation=Location[fused 58.150554,13.428953 hAcc=110.0 et=+14d8h22m58s743ms alt=252.75724228346846 vAcc=11.48749])
2024-03-20 20:32:44.686 D DeviceGeocoder: Resolved 58.1506,13.4290 to Odensberg 1, 521 92 Falköping, Sverige
2024-03-20 20:32:44.688 D CachingGeocoder: Geocode cache: hits=1290, misses=16734
2024-03-20 20:32:59.855 D MessageProcessorEndpointMqtt: Sending message Location id=1710962692651-a0f58f: (58.0781133,13.0192388). Thread: Thread[networkHandlerThread,5,main]
2024-03-20 20:32:59.857 D MessageProcessorEndpointMqtt: already connected to broker
2024-03-20 20:33:04.357 D zzl: location result received: LocationResult(lastLocation=Location[fused 58.150355,13.440664 hAcc=104.1 et=+14d8h23m17s774ms alt=257.0778132351661 vAcc=11.44303])
2024-03-20 20:33:07.584 D DeviceGeocoder: Resolved 58.1504,13.4407 to ODENSBERGS GÅRD 1, 521 92 Falköping, Sverige
2024-03-20 20:33:07.585 D CachingGeocoder: Geocode cache: hits=1290, misses=16735
2024-03-20 20:33:14.320 D zzl: location result received: LocationResult(lastLocation=Location[fused 58.155136,13.447344 hAcc=98.4 et=+14d8h23m29s956ms alt=255.39929542083553 vAcc=11.433365])
2024-03-20 20:33:14.677 D DeviceGeocoder: Resolved 58.1551,13.4473 to Gullvivevägen 18, 521 92 Falköping, Sverige
2024-03-20 20:33:14.678 D CachingGeocoder: Geocode cache: hits=1290, misses=16736
2024-03-20 20:33:29.859 E MessageProcessorEndpointMqtt: MQTT Exception delivering message
 (32000)
    at org.eclipse.paho.client.mqttv3.internal.Token.waitForCompletion(SourceFile:225)
    at org.owntracks.android.services.MessageProcessorEndpointMqtt.sendMessage(SourceFile:74)
    at org.owntracks.android.services.MessageProcessor.lambda$sendAvailableMessages$0(Unknown Source:2)
    at org.owntracks.android.services.MessageProcessor.$r8$lambda$bVYaIhYhXtJCzBaSxWOGtcUu6f8(Unknown Source:0)
    at org.owntracks.android.services.MessageProcessor$$ExternalSyntheticLambda1.call(Unknown Source:4)
    at java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at android.os.Handler.handleCallback(Handler.java:942)
    at android.os.Handler.dispatchMessage(Handler.java:99)
    at android.os.Looper.loopOnce(Looper.java:226)
    at android.os.Looper.loop(Looper.java:313)
    at android.os.HandlerThread.run(HandlerThread.java:67)

2024-03-20 20:33:29.860 E MessageProcessor: Message delivery failed. queueLength: 3, messageId: 1710962692651-a0f58f
2024-03-20 20:33:29.860 W MessageProcessor: Error sending message. Re-queueing
2024-03-20 20:33:29.867 I MessageProcessor: Waiting for 120 s before retrying
2024-03-20 20:33:30.316 D MapFragmentFactory: Instantiating Fragment for org.owntracks.android.ui.map.MapFragment
2024-03-20 20:33:30.351 D MapViewModel: setting view mode: VIEW_DEVICE
2024-03-20 20:33:30.528 D GoogleMapFragment: Maps SDK initialized with renderer: LEGACY
2024-03-20 20:33:30.899 D LocationLiveData: Removing previous locationupdate task complete.  Success=true Cancelled=false
2024-03-20 20:33:30.900 D MapActivity$serviceConnection: Service connected to MapActivity
2024-03-20 20:33:30.903 D GoogleMapFragment: Maps SDK initialized with renderer: LEGACY
2024-03-20 20:33:30.903 D GoogleMapFragment: GoogleMapFragment initMap hasLocationCapability=true
2024-03-20 20:33:30.907 D GoogleMapFragment: Drawing regions on map
2024-03-20 20:33:30.907 D MapViewModel: setting view mode: VIEW_DEVICE
2024-03-20 20:33:31.011 D LocationLiveData: LocationLiveData location update request completed: Success=true Cancelled=false
2024-03-20 20:33:33.831 D LocationLiveData: LocationLiveData removing location updates completed: Success=true Cancelled=false
2024-03-20 20:33:41.041 I ServiceStarter$Impl: starting service
2024-03-20 20:33:41.073 D MapFragmentFactory: Instantiating Fragment for org.owntracks.android.ui.map.MapFragment
2024-03-20 20:33:41.074 D MapViewModel: setting view mode: VIEW_DEVICE
2024-03-20 20:33:41.074 E MapViewModel: no location available
2024-03-20 20:33:41.158 D GoogleMapFragment: Maps SDK initialized with renderer: LEGACY
2024-03-20 20:33:41.231 D MapActivity$serviceConnection: Service connected to MapActivity
2024-03-20 20:33:41.231 D LocationLiveData: Removing previous locationupdate task complete.  Success=true Cancelled=false
2024-03-20 20:33:41.232 D GoogleMapFragment: Maps SDK initialized with renderer: LEGACY
2024-03-20 20:33:41.232 D GoogleMapFragment: GoogleMapFragment initMap hasLocationCapability=true
2024-03-20 20:33:41.236 D MapViewModel: setting view mode: VIEW_DEVICE
2024-03-20 20:33:41.236 E MapViewModel: no location available
growse commented 7 months ago

32000 is the MQTT client exception for REASON_CODE_CLIENT_TIMEOUT. The client's not getting a response back from the server. The fact that this is fixed by restarting your wireguard VPN (which I assume is what you're connecting over) hints that your network is dropping the return packets somehow.

Can you test without Wireguard to see if it still behaves the same?

LinusNil commented 7 months ago

I should have been clearer about the vpn. As mentioned the connection issue is resolved by first activating and the deactivating the vpn. In other words; I usually don't use it and even if it is activate, I only route a few LAN subnets through it.

I believe the only thing this does is trigger/force owntracks to reconnect to the broker in some way. I might have come around this issue by using Clean Session, which I believe makes Owntracks open and close the TCP connection repeatedly in some way.

Addition: I still occasionally run into this issue, also with Clean Session activated. I also noticed that if I close the app gracefully (with the Exit App menu option) and then start it, it regains connection to the broker and starts working properly.

growse commented 7 months ago

Network changes do trigger a reconnect, so that's probably why you're seeing a reconnect.

It does smell an awful lot like a middle router (or something) between your device and your broker is stopping packets from flowing at some point after the connection is established. I've seen this before on some ... interesting NAT implementations & firewalls that had a slightly wonky view on what "connected" meant (they only let traffic flow when there was a connection formed by a TCP handshake, and then at some point forgot that happened so dropped further packets in the flow).

Clean session is just a flag sent to the broker on connect to say "don't send me any outstanding Qos 1 or 2 messages that you might have queued from my last session". It doesn't affect reconnects.

There's a 2.5.0 beta out shortly, I'd be interested in whether or not you experience the same thing there.

LinusNil commented 7 months ago

Ok. I got the Clean Session thing wrong then. Thank you for enlightening me on that.

Well, I am behind CGNAT while I'm not on WiFi. Also, my mobile connect to WiFi whenever it is available. Looking forward for updates.

This might be considered a not so desirable workaround, but would it be possible for OwnTracks to simply reinitiate the TCP connection if it runs into these kind of errors for a certain timespan?

growse commented 7 months ago

This might be considered a not so desirable workaround, but would it be possible for OwnTracks to simply reinitiate the TCP connection if it runs into these kind of errors for a certain timespan?

One of the things the new version should enable is setting a keepalive to <900s. In theory (because software), this should mean that if the broker / client doesn't see a ping at least every $keepalive seconds, it'll RST the connection and force a re-connect. It sounds like you'd benefit from having a value closer to around 60s (depending on impact on your battery life).

LinusNil commented 7 months ago

This sounds great. Do you know it this will also be implemented in the iOS version?

jpmens commented 7 months ago

I believe keepalives work fine in iOS.

github-actions[bot] commented 5 months ago

Marking issue with request for more data as stale, due to no updates.