Closed chenxiaolong closed 8 months ago
Hmm, it looks like the app is getting killed for doing too many binder calls in the background:
10-21 16:09:13.031 1452 2008 D ActivityManager: freezing 30385 com.github.tmo1.sms_ie.debug
10-21 16:09:13.031 1452 2008 D ActivityManager: Kill app due to repeated failure to freeze binder: 30385 com.github.tmo1.sms_ie.debug
10-21 16:09:13.032 1452 1749 I ActivityManager: Killing 30385:com.github.tmo1.sms_ie.debug/u0a312 (adj 900): excessive binder traffic during cached
10-21 16:09:13.041 1452 1767 I libprocessgroup: Successfully killed process cgroup uid 10312 pid 30385 in 5ms
This behavior was introduced in Android 14 in commit https://android.googlesource.com/platform/frameworks/base.git/+/71d75c09b9a06732a6edb4d1488d2aa3eb779e14%5E%21/. I'll dig around and see if there's a way this can be worked around.
Hi,
Thank you for reporting this, and thank you very much for taking the time and making the effort to figure out the problem and create a (well-documented!) workaround. You're clearly a more advanced Android developer than I am :)
I would like to understand the problem better, though, before merging a workaround like your PR: what, exactly, is the app doing wrong? I thought that doing high volume and long running database queries and file I/O is exactly the sort of thing that background work is designed for. What is the "excessive binder traffic" that Android is complaining about? Perhaps if I understood the problem better, I could redesign the scheduled export code to be more compliant with what Android expects.
Sure thing!
I don't think sms-ie is doing anything incorrectly. The problem is mainly in these two calls:
Each of those queries involves inter-process communication between sms-ie and Android's MmsProvider
component, which performs a binder transaction under the hood. While the second query could potentially optimized so that it's only done once (and mid
is filtered client side), I don't think the same optimization can be done for content://mms/$msgId/addr
because it's querying a different URI each time. There doesn't seem to be a content://mms/addr
for querying everything at once: https://cs.android.com/android/platform/superproject/main/+/main:packages/providers/TelephonyProvider/src/com/android/providers/telephony/MmsProvider.java;l=1163;drc=c89bafa3f22a9dce56adb9b7fab2f9149d18c456.
In my case, I have 5000 or so MMS messages (well, RCS), so it ends up performing 10000+ queries. It takes around 4 minutes to export and Android usually kills the process about 1 minute 15 seconds in (~3k queries).
I thought that doing high volume and long running database queries and file I/O is exactly the sort of thing that background work is designed for.
That's unfortunately no longer true in recent Android versions. There are special exceptions to this though, like downloading files, but in the general case, without using a foreground service, it is very likely that long-running (multiple minute) operations will be killed.
Ah, I think I understand - thanks for the detailed explanation.
Rant:
I find Android development really aggravating. Everything is constantly changing, and unlike in Linux, where (with apologies to Perl) easy things are generally easy and hard things are usually at least possible, in Android it feels like easy things are generally hard and lots of things turn out to be impossible.
The sheer amount of rules and restrictions imposed by the system just feels developer hostile, and while I understand that the purpose is to protect users and provide them with good user experiences, the result seems to be an unfortunate lack of availability of quality apps for those same users. E.g.: on Linux, there are innumerable backup applications, ranging from simple and easy to use to complex and powerful, from CLI to GUI, while (due to Android design decisions) there's literally not one decent generally usable backup app for normal non-rooted Android devices (which is in part why SMS I/E is needed in the first place).
To be fair, I'm just an amateur, hobbyist developer, and perhaps this is not an entirely fair and accurate assessment of the state of things on Android, but this is what the situation feels like to me
End rant.
Thank you again very much for your work on this, though. I'll do some testing, and if everything looks good, I'll merge the PR. (You mention not having a telephony-capable Android <= 13 device for testing - did you try testing on a pre-14 AVD?)
I 100% agree with your rant and feel the same way. Usable Linux phones can't come soon enough.
(You mention not having a telephony-capable Android <= 13 device for testing - did you try testing on a pre-14 AVD?)
I wasn't aware they had the telephony components. I'll give that a try and report back.
I tested on a handful of versions via AVD:
I found a crash when opening the settings screen and pushed a fix for it in my branch: https://github.com/tmo1/sms-ie/commit/c4aa39c7c1bab07ba93eef4519640598b2d7c303.
Thanks! I'm not sure why I didn't hit that crash when I tested the API 19 code, but I did reproduce it now, and confirmed that your fix fixes it.
App is likely to be killed when exporting many MMS messages via the background service. Runs reliably via the foreground service.
Thank you again very much for all the work, testing, and documenting you've done on this!
I've reviewed the code in your patch, and I think I understand what it's doing, but before I merge it, I'm trying to reproduce the crash you're seeing with the old code (and the patched code with battery optimizations disabled) and confirm that the fix works, but so far I haven't been able to reproduce the crash. I just did a scheduled export of 15400 MMS messages (on a Pixel 7 API 34 AVD) and it worked fine. Do you have any ideas as to how I can trigger the crash?
No problem at all!
Do you have any ideas as to how I can trigger the crash?
I forgot to mention this in my last post. I can reproduce the crash reliably, even after a factory reset, on a Pixel 8 Pro nearly every time, but for whatever reason, never in the AVD. In both cases, I restored from the same backup (~5000 MMS messages) and triggered a scheduled export.
I'm not sure if Android's CachedAppOptimizer
is disabled in the emulator or if there's something that prevents it from running and killing apps. I know certain actions will disable it temporarily, like adb shell pm dump <package>
. There's a CachedAppOptimizer.dump()
function that would print out a lot of helpful statistics about what's going on, but I can't find a good way to call it without doing something hacky with root access.
Do you have any ideas as to how I can trigger the crash?
I realized that when I tested, I left the app in the foreground, so that's likely why the killing / freezing behavior did not occur :|
I did a bunch more tests with the app removed from the foreground, on API 33 and 34 (all on AVDs), and I observed the following:
I never encountered the Kill app due to repeated failure to freeze binder
and Killing ... : excessive binder traffic during cached
messages that you encountered - perhaps, as you suggested, the emulator behaves differently from physical devices in this regard?
Instead, in my testing, when the scheduled export ran as a background service, it was often / usually killed by ActivityManager
with a different message, along the lines of: Killing 12531:com.github.tmo1.sms_ie/u0a191 (adj 700): excessive cpu 169000 during 300075 dur=309031 limit=25
. This occurred with both API 33 and 34. At least some of the time, this occurred after about ten minutes, so I assume that this is due to Android's limitations on background services not specified as "long running" - I saw this and this, although I can't find precise documentation of Android's service killing rules.
Your foreground service patch seemed to work consistently in my testing as well, although I haven't tested extensively. When running as a foreground service, I observed, after the first ten minutes and every five minutes or so after that, log entries like Letting 82e10f7 #u0a191/36 com.github.tmo1.sms_ie/androidx.work.impl.background.systemjob.SystemJobService continue to run past min execution time
, which supports the hypothesis that without foreground execution, Android is killing the service even in the absence of binder issues.
So it looks like your patch is definitely necessary for the reliable running of large scheduled exports, whether to avoid the binder issue or the "excessive cpu" / long-running service problem, but I wish I had a better understanding of all this.
Another, likely unrelated question: when doing all this testing, I noticed that the notifications that the app is supposed to show upon completion of a scheduled export run are not being displayed (in 33 or 34). This used to work, and I'm not sure when / why it no longer does. Did you notice the completion notifications in your testing?
Thanks for the additional testing!
Another, likely unrelated question: when doing all this testing, I noticed that the notifications that the app is supposed to show upon completion of a scheduled export run are not being displayed (in 33 or 34). This used to work, and I'm not sure when / why it no longer does. Did you notice the completion notifications in your testing?
This is because sms-ie currently doesn't use the POST_NOTIFICATIONS
permission, which must be declared in the manifest and also requested at runtime as of API 33. Maybe sms-ie could prompt for that permission when enabling scheduled exports?
Maybe sms-ie could prompt for that permission when enabling scheduled exports?
Sounds right - I'll try it.
Maybe sms-ie could prompt for that permission when enabling scheduled exports?
Done - thanks!
On Android 14 (Google Pixel stock firmware), the scheduled exports seem to be crashing, even though exports done manually via the UI work fine. When
ExportWorker
tries it run, it fails during the message export phase with a binder error:Manually triggering with scheduled job with the following command also results in the same crash:
When the crash occurs, it leaves behind a partially written zip file with just the local zip entry header for
messages.ndjson
:I'm not sure why this is happening yet. I'll see if I can reproduce the problem this weekend with a debug build to get a better stack trace. (The logs above are from the precompiled 2.2.0 APK from the Github release page.)