termux / termux-api-package

Termux package containing scripts to call functionality in Termux:API.
MIT License
994 stars 314 forks source link

Background jobs: Occasional hang #166

Open Terrance opened 1 year ago

Terrance commented 1 year ago

Problem description

I have a number of scripts set to run via daily or twice-daily scheduled jobs, and at the end they send a notification if they've done any work. Sometimes this final notification step hangs (with the task remaining running until I manually go and kill it).

Steps to reproduce

This isn't consistently reproducible (happens maybe once a week/fortnight), but may be down to two scripts trying to send notifications simultaneously? I've tried the likes of termux-notification -c A & termux-notification -c B but can't immediately reproduce it.

They each end with something like:

[ $? -eq 0 ] && termux-notification -c "Backup complete"

Additional information

$ termux-info
Termux Variables:
TERMUX_APP_PACKAGE_MANAGER=apt
TERMUX_MAIN_PACKAGE_FORMAT=debian
TERMUX_VERSION=0.118.0
Packages CPU architecture:
aarch64
Subscribed repositories:
# sources.list
deb https://mirror.accum.se/mirror/termux.dev/termux-main stable main
Updatable packages:
All packages up to date
termux-tools version:
1.32.0
Android version:
12
Kernel build information:
Linux localhost 4.19.157-perf+ #1 SMP PREEMPT Sat Sep 17 00:13:28 CST 2022 aarch64 Android
Device manufacturer:
OnePlus
Device model:
EB2103

I'll try and grab a logcat next time it happens.

bemau commented 1 year ago

getting the same error, here is my logcat - have you been able to find a solution for the problem?

04-09 22:54:05.621 26230 26230 D AndroidRuntime: >>>>>> START com.android.internal.os.RuntimeInit uid 10003 <<<<<<
04-09 22:54:05.631 26230 26230 I AndroidRuntime: Using default boot image
04-09 22:54:05.631 26230 26230 I AndroidRuntime: Leaving lock profiling enabled
04-09 22:54:05.631 26230 26230 W libc    : Access denied finding property "odsign.verification.success"
04-09 22:54:05.632 26230 26230 D SecNativeFeatureCppIf: encoded
04-09 22:54:05.632 26230 26230 D SecNativeFeatureCppIf: decoding and unzip finish
04-09 22:54:05.632 26230 26230 D SecNativeFeatureCppIf: use unCompDataBuff. encoded
04-09 22:54:05.633 26230 26230 D SecNativeFeatureCppIf: Free unCompDataBuff
04-09 22:54:05.633 26230 26230 D AndroidRuntime: addProductProperty: pBrand1 is not null
04-09 22:54:05.633 26230 26230 D AndroidRuntime: addProductProperty: not brand or 7
04-09 22:54:05.640 26230 26230 W app_process: ART APEX data files are untrusted.
04-09 22:54:05.641 26230 26230 I app_process: Using CollectorTypeCC GC.
04-09 22:54:05.884 26230 26230 D app_process: Time zone APEX ICU file found: /apex/com.android.tzdata/etc/icu/icu_tzdata.dat
04-09 22:54:05.884 26230 26230 D app_process: I18n APEX ICU file found: /apex/com.android.i18n/etc/icu/icudt70l.dat
04-09 22:54:05.905 26230 26230 W ziparchive: Unable to open '/data/data/com.termux/files/usr/libexec/termux-am/am.dm': No such file or directory
04-09 22:54:05.945 26230 26230 E SemDvfsManager_JNI: SemDvfsManager: registerfunction enter
04-09 22:54:05.946 26230 26230 E SemAffinityControl: SemAffinityControl: registerfunction enter
04-09 22:54:05.963 26230 26230 D AndroidRuntime: Calling main entry com.termux.termuxam.Am
04-09 22:54:05.995 26230 26230 D AndroidRuntime: Shutting down VM
Terrance commented 1 year ago

Still happening for me -- unfortunately I don't immediately notice it happening, only later when I spot the Termux notification with lingering tasks. I'm not sure this is limited to termux-notification, as I've caught other processes in similarly stuck states when launched from a background job:

$ ps faux
u0_a291  12043  0.0  0.0 2124908 2116 ?        S     1970   0:00 sh /data/data/com.termux/files/home/.shortcuts/tasks/jobs/one.sh
u0_a291  12147  0.0  0.0 2240508 2148 ?        S     1970   0:00  \_ ssh ...
u0_a291  11985  0.0  0.0 2124908 1952 ?        S     1970   0:00 sh /data/data/com.termux/files/home/.shortcuts/tasks/jobs/two.sh
u0_a291  11992  0.0  0.0 2260232 2124 ?        S     1970   0:00  \_ python3 bin/two.py ...
u0_a291   9492  0.0  0.0 2176108 1944 ?        S     1970   0:00 sh /data/data/com.termux/files/home/.shortcuts/tasks/jobs/two.sh
u0_a291   9509  0.0  0.0 2301192 1976 ?        S     1970   0:00  \_ python3 bin/two.py ...
u0_a291  28661  0.0  0.0 2124908 1720 ?        S     1970   0:23 sh /data/data/com.termux/files/home/.shortcuts/tasks/jobs/three.sh
u0_a291  28782  0.0  0.0 2191044 1948 ?        S     1970   0:18  \_ /data/data/com.termux/files/usr/bin/bash /data/data/com.termux/files/usr/bin/termux-notification -c ...
u0_a291  28786  0.0  0.0 2124824 1852 ?        S     1970   0:12      \_ /data/data/com.termux/files/usr/libexec/termux-api Notification
u0_a291  28787  0.0  0.2 5105436 26228 ?       Sl    1970   0:08          \_ /system/bin/app_process / com.termux.termuxam.Am broadcast --user 0 -n com.termux.api/.TermuxApiReceiver --es socket_input 76836bdbbecb0a4d-7072-4e99-b8bd-71b3de388aef9baa190d59a3 --es socket_output 5a019ab33c55ae4f-7072-499e-bf8c-ef3d2c14b9a19f792c8f53ac --es api_method Notification
u0_a291  22647  0.0  0.0 2149484 1852 ?        S     1970   1:07 sh /data/data/com.termux/files/home/.shortcuts/tasks/jobs/one.sh
u0_a291  22755  0.0  0.0 2213572 1928 ?        S     1970   1:03  \_ /data/data/com.termux/files/usr/bin/bash /data/data/com.termux/files/usr/bin/termux-notification -c ...
u0_a291  22759  0.0  0.0 2124824 1764 ?        S     1970   0:32      \_ /data/data/com.termux/files/usr/libexec/termux-api Notification
u0_a291  22760  0.0  0.1 5185308 23048 ?       Sl    1970   0:19          \_ /system/bin/app_process / com.termux.termuxam.Am broadcast --user 0 -n com.termux.api/.TermuxApiReceiver --es socket_input ddfdf7e62fc859cb-58e7-4a0a-b8d8-f20a8d8a7e62993fcad675c8 --es socket_output 30a075b0c2778458-58e7-45de-bb3e-4c6a4d34fda455141c41feaf --es api_method Notification
u0_a291  12394  0.0  0.0 2133048 2972 ?        S<s   1970   0:00 sshd
u0_a291  12480  0.3  0.0 2194496 7772 ?        S<s   1970   0:00  \_ /data/data/com.termux/files/usr/bin/sshd -R
u0_a291  12484  0.4  0.0 2170584 6132 pts/0    S<s   1970   0:00      \_ /data/data/com.termux/files/usr/bin/bash -l
u0_a291  13195  2.0  0.0 2125548 2740 pts/0    R<+   1970   0:00          \_ ps faux

$ strace -p 12147 # ssh
strace: Process 12147 attached
ppoll([{fd=3, events=POLLIN}], 1, ...
<left attached, about 4 minutes later>
ppoll([{fd=3, events=POLLIN}], 1, NULL, NULL, 8) = 1 ([{fd=3, revents=POLLIN|POLLERR|POLLHUP}])
read(3, 0x7fee524018, 8192)             = -1 ECONNRESET (Connection reset by peer)
write(2, "Connection reset by ..."..., 43) = 43
exit_group(255)                         = ?
+++ exited with 255 +++

$ strace -p 11992 # python
strace: Process 11992 attached
read(3,

$ strace -p 28787 # app_process / com.termux.termuxam.Am
strace: Process 28787 attached
futex(0xb40000721fd85970, FUTEX_WAIT_PRIVATE, 0, NULL

The latter traces remain stuck after about 10 minutes, though the processes have likely been running in the order of hours -- these processes eventually drop off, not sure if they eventually become unwedged or they're being reaped by the system.