pushy / pushy-flutter

The official Pushy SDK for Flutter apps.
Apache License 2.0
21 stars 19 forks source link

(Android) Pushy sends keep-alive every 15 seconds after failed reconnect #31

Closed ryanheise closed 3 years ago

ryanheise commented 3 years ago

We're getting an alert on Android vitals that our production app is waking up the device too frequently and draining the battery, in some cases. This was based on the 1.1.9 release of this plugin, but the 2.0.0 release is also based on the same underlying Android pushy sdk version so I assume the problem will also happen with it.

The following adb log was captured when trying to reproduce on our app in debug mode, after the app had already been running for 2 hours in the background (probably undergoing several some dozes and/or wifi disconnections):

Click to view logs

``` 05-19 22:22:44.418 14174 14963 D Pushy : Connection lost 05-19 22:25:35.798 14174 15972 D Pushy : PushyJobService: Sending keep alive 05-19 22:30:45.509 14174 16318 D Pushy : PushyJobService: Sending keep alive 05-19 22:34:05.189 14174 14956 D Pushy : Connection lost 05-19 22:34:05.217 14174 16422 D Pushy : PushyJobService: Connecting... 05-19 22:34:05.220 14174 16422 D Pushy : Broker: ... 05-19 22:34:05.220 14174 16422 D Pushy : Device Token: ... 05-19 22:34:05.220 14174 16422 D Pushy : Device Auth Key: ... 05-19 22:34:05.257 14174 16422 D Pushy : Connect exception: (0) - java.net.UnknownHostException: mqtt-... 05-19 22:34:05.257 14174 16422 D Pushy : Reconnecting in 1000ms 05-19 22:41:27.343 14174 16422 D Pushy : PushyJobService: Connecting... 05-19 22:41:27.344 14174 16422 D Pushy : Broker: ... 05-19 22:41:27.344 14174 16422 D Pushy : Device Token: ... 05-19 22:41:27.344 14174 16422 D Pushy : Device Auth Key: ... 05-19 22:41:31.588 14174 16422 D Pushy : Connected successfully (sending keep alive every 300 seconds) 05-19 22:41:31.642 14174 16834 D Pushy : Connection lost 05-19 22:41:31.667 14174 16841 D Pushy : PushyJobService: Connecting... 05-19 22:41:31.667 14174 16841 D Pushy : Broker: ... 05-19 22:41:31.667 14174 16841 D Pushy : Device Token: ... 05-19 22:41:31.667 14174 16841 D Pushy : Device Auth Key: ... 05-19 22:41:31.690 14174 16841 D Pushy : Connect exception: (0) - java.net.UnknownHostException: mqtt-... 05-19 22:41:31.690 14174 16841 D Pushy : Reconnecting in 1000ms 05-19 22:55:39.734 14174 16422 D Pushy : PushyJobService: Connecting... 05-19 22:55:39.734 14174 16422 D Pushy : Broker: ... 05-19 22:55:39.734 14174 16422 D Pushy : Device Token: ... 05-19 22:55:39.734 14174 16422 D Pushy : Device Auth Key: ... 05-19 22:55:43.000 14174 16422 D Pushy : Connected successfully (sending keep alive every 300 seconds) 05-19 22:55:43.058 14174 17099 D Pushy : Connection lost 05-19 22:55:43.087 14174 17113 D Pushy : PushyJobService: Connecting... 05-19 22:55:43.087 14174 17113 D Pushy : Broker: ... 05-19 22:55:43.088 14174 17113 D Pushy : Device Token: ... 05-19 22:55:43.088 14174 17113 D Pushy : Device Auth Key: ... 05-19 22:55:43.110 14174 17113 D Pushy : Connect exception: (0) - java.net.UnknownHostException: mqtt-... 05-19 22:55:43.110 14174 17113 D Pushy : Reconnecting in 1000ms 05-19 23:15:22.084 14174 17113 D Pushy : PushyJobService: Connecting... 05-19 23:15:22.084 14174 17113 D Pushy : Broker: ... 05-19 23:15:22.084 14174 17113 D Pushy : Device Token: ... 05-19 23:15:22.084 14174 17113 D Pushy : Device Auth Key: ... 05-19 23:15:26.944 14174 17113 D Pushy : Connected successfully (sending keep alive every 300 seconds) 05-19 23:20:30.374 14174 18217 D Pushy : PushyJobService: Sending keep alive 05-19 23:25:35.469 14174 18323 D Pushy : PushyJobService: Sending keep alive 05-19 23:30:46.002 14174 18473 D Pushy : PushyJobService: Sending keep alive 05-19 23:33:30.736 1681 1850 I ActivityManager: Start proc 21101:me.pushy.example/u0a391 for service {me.pushy.example/me.pushy.sdk.services.PushyJobService} 05-19 23:33:31.722 21101 21101 I e.pushy.exampl: Late-enabling -Xcheck:jni 05-19 23:33:31.887 21101 21101 I e.pushy.exampl: Unquickening 12 vdex files! 05-19 23:33:33.165 21101 21101 I TetheringManager: registerTetheringEventCallback:me.pushy.example 05-19 23:33:33.173 21101 21150 D Pushy : PushyJobService: Connecting... 05-19 23:33:33.292 21101 21150 D Pushy : Broker: ... 05-19 23:33:33.300 21101 21150 D Pushy : Device Token: ... 05-19 23:33:33.301 21101 21150 D Pushy : Device Auth Key: ... 05-19 23:33:54.396 21101 21150 D Pushy : Connect exception: (32000) 05-19 23:33:54.396 21101 21150 D Pushy : (32000) 05-19 23:33:54.396 21101 21150 D Pushy : at me.pushy.sdk.lib.paho.internal.Token.waitForCompletion(Token.java:112) 05-19 23:33:54.396 21101 21150 D Pushy : at me.pushy.sdk.lib.paho.MqttToken.waitForCompletion(MqttToken.java:67) 05-19 23:33:54.396 21101 21150 D Pushy : at me.pushy.sdk.lib.paho.MqttClient.connect(MqttClient.java:241) 05-19 23:33:54.396 21101 21150 D Pushy : at me.pushy.sdk.util.PushyMqttConnection.connect(PushyMqttConnection.java:159) 05-19 23:33:54.396 21101 21150 D Pushy : at me.pushy.sdk.services.PushyJobService$ConnectAsync.doInBackground(PushyJobService.java:221) 05-19 23:33:54.396 21101 21150 D Pushy : at me.pushy.sdk.services.PushyJobService$ConnectAsync.doInBackground(PushyJobService.java:203) 05-19 23:33:54.396 21101 21150 D Pushy : at android.os.AsyncTask$3.call(AsyncTask.java:394) 05-19 23:33:54.396 21101 21150 D Pushy : at java.util.concurrent.FutureTask.run(FutureTask.java:266) 05-19 23:33:54.396 21101 21150 D Pushy : at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:305) 05-19 23:33:54.396 21101 21150 D Pushy : at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167) 05-19 23:33:54.396 21101 21150 D Pushy : at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641) 05-19 23:33:54.396 21101 21150 D Pushy : at java.lang.Thread.run(Thread.java:923) 05-19 23:33:54.397 21101 21150 D Pushy : Reconnecting in 1000ms 05-19 23:34:26.151 21101 21150 D Pushy : PushyJobService: Sending keep alive 05-19 23:34:41.752 21101 21846 D Pushy : PushyJobService: Sending keep alive 05-19 23:35:00.069 21101 21846 D Pushy : PushyJobService: Sending keep alive 05-19 23:35:15.343 21101 21846 D Pushy : PushyJobService: Sending keep alive 05-19 23:35:32.917 21101 22010 D Pushy : PushyJobService: Sending keep alive 05-19 23:35:48.575 14174 22014 D Pushy : PushyJobService: Sending keep alive 05-19 23:35:49.072 21101 22010 D Pushy : PushyJobService: Sending keep alive 05-19 23:36:04.435 21101 22010 D Pushy : PushyJobService: Sending keep alive 05-19 23:36:19.703 21101 22010 D Pushy : PushyJobService: Sending keep alive 05-19 23:36:35.057 21101 22074 D Pushy : PushyJobService: Sending keep alive 05-19 23:36:52.568 21101 22080 D Pushy : PushyJobService: Sending keep alive 05-19 23:37:07.908 21101 22084 D Pushy : PushyJobService: Sending keep alive 05-19 23:37:23.297 21101 22086 D Pushy : PushyJobService: Sending keep alive 05-19 23:37:38.667 21101 22088 D Pushy : PushyJobService: Sending keep alive 05-19 23:37:54.015 21101 22092 D Pushy : PushyJobService: Sending keep alive 05-19 23:38:09.397 21101 22097 D Pushy : PushyJobService: Sending keep alive 05-19 23:38:24.742 21101 22100 D Pushy : PushyJobService: Sending keep alive 05-19 23:38:40.094 21101 22100 D Pushy : PushyJobService: Sending keep alive 05-19 23:38:55.459 21101 22108 D Pushy : PushyJobService: Sending keep alive 05-19 23:39:10.825 21101 22113 D Pushy : PushyJobService: Sending keep alive 05-19 23:39:27.007 21101 22115 D Pushy : PushyJobService: Sending keep alive 05-19 23:39:42.307 21101 22121 D Pushy : PushyJobService: Sending keep alive 05-19 23:39:57.725 21101 22129 D Pushy : PushyJobService: Sending keep alive 05-19 23:40:13.083 21101 22132 D Pushy : PushyJobService: Sending keep alive 05-19 23:40:30.319 21101 22134 D Pushy : PushyJobService: Sending keep alive 05-19 23:40:45.645 21101 22139 D Pushy : PushyJobService: Sending keep alive 05-19 23:40:50.272 14174 22136 D Pushy : PushyJobService: Sending keep alive 05-19 23:41:05.955 21101 22139 D Pushy : PushyJobService: Sending keep alive 05-19 23:41:27.449 21101 22139 D Pushy : PushyJobService: Sending keep alive 05-19 23:41:48.322 21101 22300 D Pushy : PushyJobService: Sending keep alive 05-19 23:42:04.069 21101 22314 D Pushy : PushyJobService: Sending keep alive 05-19 23:42:19.371 21101 22325 D Pushy : PushyJobService: Sending keep alive 05-19 23:42:36.231 21101 22327 D Pushy : PushyJobService: Sending keep alive 05-19 23:42:51.507 21101 22336 D Pushy : PushyJobService: Sending keep alive 05-19 23:43:06.865 21101 22343 D Pushy : PushyJobService: Sending keep alive 05-19 23:43:22.209 21101 22369 D Pushy : PushyJobService: Sending keep alive 05-19 23:43:37.515 21101 22426 D Pushy : PushyJobService: Sending keep alive 05-19 23:43:52.823 21101 22433 D Pushy : PushyJobService: Sending keep alive 05-19 23:44:09.923 21101 22435 D Pushy : PushyJobService: Sending keep alive 05-19 23:44:28.077 21101 22435 D Pushy : PushyJobService: Sending keep alive 05-19 23:44:43.515 21101 22445 D Pushy : PushyJobService: Sending keep alive 05-19 23:44:58.859 21101 22450 D Pushy : PushyJobService: Sending keep alive 05-19 23:45:14.224 21101 22454 D Pushy : PushyJobService: Sending keep alive 05-19 23:45:29.536 21101 22458 D Pushy : PushyJobService: Sending keep alive 05-19 23:45:44.848 21101 22461 D Pushy : PushyJobService: Sending keep alive 05-19 23:45:58.510 14174 22463 D Pushy : PushyJobService: Sending keep alive 05-19 23:46:01.139 21101 22464 D Pushy : PushyJobService: Sending keep alive 05-19 23:46:16.435 21101 22468 D Pushy : PushyJobService: Sending keep alive 05-19 23:46:32.748 21101 22470 D Pushy : PushyJobService: Sending keep alive 05-19 23:46:48.049 21101 22470 D Pushy : PushyJobService: Sending keep alive 05-19 23:47:04.418 21101 22475 D Pushy : PushyJobService: Sending keep alive 05-19 23:47:19.753 21101 22481 D Pushy : PushyJobService: Sending keep alive 05-19 23:47:35.033 21101 22484 D Pushy : PushyJobService: Sending keep alive 05-19 23:47:53.155 21101 22493 D Pushy : PushyJobService: Sending keep alive 05-19 23:48:10.767 21101 22493 D Pushy : PushyJobService: Sending keep alive 05-19 23:48:29.248 21101 22520 D Pushy : PushyJobService: Sending keep alive 05-19 23:48:44.570 21101 22523 D Pushy : PushyJobService: Sending keep alive 05-19 23:48:59.904 21101 22689 D Pushy : PushyJobService: Sending keep alive 05-19 23:49:16.535 21101 22702 D Pushy : PushyJobService: Sending keep alive 05-19 23:49:32.897 21101 22711 D Pushy : PushyJobService: Sending keep alive 05-19 23:49:48.265 21101 22747 D Pushy : PushyJobService: Sending keep alive 05-19 23:50:04.548 21101 22775 D Pushy : PushyJobService: Sending keep alive 05-19 23:50:20.717 21101 22795 D Pushy : PushyJobService: Sending keep alive 05-19 23:50:38.338 21101 22806 D Pushy : PushyJobService: Sending keep alive 05-19 23:50:55.846 21101 22810 D Pushy : PushyJobService: Sending keep alive 05-19 23:50:59.871 14174 22812 D Pushy : PushyJobService: Sending keep alive 05-19 23:51:11.135 21101 22810 D Pushy : PushyJobService: Sending keep alive 05-19 23:51:29.326 21101 22810 D Pushy : PushyJobService: Sending keep alive 05-19 23:51:44.621 21101 22810 D Pushy : PushyJobService: Sending keep alive 05-19 23:51:59.943 21101 22873 D Pushy : PushyJobService: Sending keep alive 05-19 23:52:15.286 21101 22873 D Pushy : PushyJobService: Sending keep alive 05-19 23:52:40.091 21101 22881 D Pushy : PushyJobService: Sending keep alive 05-19 23:52:55.409 21101 22881 D Pushy : PushyJobService: Sending keep alive 05-19 23:53:20.068 21101 22893 D Pushy : PushyJobService: Sending keep alive 05-19 23:53:35.991 21101 22929 D Pushy : PushyJobService: Sending keep alive 05-19 23:53:51.372 21101 22934 D Pushy : PushyJobService: Sending keep alive 05-19 23:54:06.686 21101 22936 D Pushy : PushyJobService: Sending keep alive 05-19 23:54:22.073 21101 22942 D Pushy : PushyJobService: Sending keep alive 05-19 23:54:38.533 21101 22947 D Pushy : PushyJobService: Sending keep alive 05-19 23:54:53.974 21101 22981 D Pushy : PushyJobService: Sending keep alive 05-19 23:55:09.247 21101 23018 D Pushy : PushyJobService: Sending keep alive 05-19 23:55:24.539 21101 23060 D Pushy : PushyJobService: Sending keep alive 05-19 23:55:39.907 21101 23063 D Pushy : PushyJobService: Sending keep alive 05-19 23:55:56.276 21101 23084 D Pushy : PushyJobService: Sending keep alive 05-19 23:56:07.391 14174 23088 D Pushy : PushyJobService: Sending keep alive 05-19 23:56:14.209 21101 23084 D Pushy : PushyJobService: Sending keep alive 05-19 23:56:29.501 21101 23091 D Pushy : PushyJobService: Sending keep alive 05-19 23:56:44.821 21101 23093 D Pushy : PushyJobService: Sending keep alive 05-19 23:57:00.210 21101 23102 D Pushy : PushyJobService: Sending keep alive 05-19 23:57:15.484 21101 23121 D Pushy : PushyJobService: Sending keep alive 05-19 23:57:31.254 21101 23128 D Pushy : PushyJobService: Sending keep alive 05-19 23:57:46.534 21101 23140 D Pushy : PushyJobService: Sending keep alive 05-19 23:58:01.815 21101 23149 D Pushy : PushyJobService: Sending keep alive 05-19 23:58:17.104 21101 23177 D Pushy : PushyJobService: Sending keep alive 05-19 23:58:34.891 21101 23200 D Pushy : PushyJobService: Sending keep alive 05-19 23:58:50.175 21101 23200 D Pushy : PushyJobService: Sending keep alive 05-19 23:59:05.468 21101 23207 D Pushy : PushyJobService: Sending keep alive 05-19 23:59:20.781 21101 23211 D Pushy : PushyJobService: Sending keep alive 05-19 23:59:43.073 21101 23214 D Pushy : PushyJobService: Sending keep alive 05-19 23:59:58.425 21101 23222 D Pushy : PushyJobService: Sending keep alive ```

(I've omitted lines like dropping data for dimension key ... from the above log which were numerous.)

Our heartbeat interval is the default of 5 minutes, and this was indeed the frequency of keep-alive messages logged right after the app was launched.

I could not find the git repository for pushy-sdk-android to debug this but one theory is that each failed reconnect potentially causes a new alarm to be set so we end up with multiple alarms running in parallel, increasing the effective frequency. That theory could be wrong, perhaps there is some significance to the "15 second" interval. I was testing this on a Google Pixel 3a (i.e. not one of the Chinese-manufactured devices listed as requiring more frequent keep-alive intervals.)

On a side note, I have been following Google's guide to gather more information on this however it is also difficult to monitor the alarm usage since Pushy doesn't set a tag on the alarm (See best practices ).

pushy commented 3 years ago

Hi @ryanheise, Thanks so much for the detailed bug report. We'd be glad to help with investigation and resolving this issue.

Based on the logs, it appears PID 21101 belongs to our demo app: me.pushy.example. It appears the issue was occurring for our demo app, and not for your own app (with PID 14174). Your own app, with PID 14174, was still properly sending keepalives every 5 minutes, even when the Pushy Demo app went into a craze and started sending out keep alives every 15 seconds.

1) Can you confirm PID 14174 was your own app? 2) For some reason, there is no log of PID 21101 (Pushy Demo app) ever connecting to our service (PushyJobService: Connecting... log) before it stated sending keepalives uncontrollably. Were any of those logs omitted?

05-19 23:33:54.397 21101 21150 D Pushy   : Reconnecting in 1000ms
05-19 23:34:26.151 21101 21150 D Pushy   : PushyJobService: Sending keep alive

3) We found two potential code paths that may have led to the 15-second "Sending keep alive" bug, and rectified them. We'll be releasing a new SDK version today/tomorrow and hopefully it will resolve the issue.

4) Do you by any chance have any reproduction steps, or is this a really rare state of the SDK?

pushy commented 3 years ago

HI @ryanheise, We've just released version 2.0.1 of pushy_flutter which references Pushy Android SDK version 1.0.70, which includes the two attempted code path fixes.

1) Please edit the pubspec.yaml in the root directory of your project and update the plugin version referenced to the following:

pushy_flutter: 2.0.1

2) Run flutter pub get to fetch the new version.

3) Run your app and observe whether the SDK ever enters this state ever again.

Thanks for your help!

ryanheise commented 3 years ago

Thanks for the speedy response! It may take us some time before the next release when we can get some new numbers for vitals, but we'll keep an eye on them.

In resolving the two potential code paths you mentioned, are you able to shed any light on how one could reproduce this issue for testing purposes?

pushy commented 3 years ago

Hi @ryanheise, Unfortunately, we weren't able to reproduce the issue. The two fixes performed aim to prevent a scenario where the SDK has not yet configured a keep alive interval, but the connection for some reason is in an active state. Hopefully, the fixes are effective.