openwrt / luci

LuCI - OpenWrt Configuration Interface
Apache License 2.0

rpcd segfaults #7335

Open systemcrash opened 3 weeks ago

systemcrash commented 3 weeks ago

ping @jow-

Segfaults and hangs. Sometimes.

ucode ddns.uc thinks the code is fine.

Attached (collapsed): hang code, segfault code

But when I put the file in /usr/share/rpcd/ucode and run:

service rpcd restart
ubus call ddns get_env # runs OK
ubus call ddns get_services_log '{"service_name": "myddns_ipv4"}' # runs OK
ubus call ddns get_ddns_state # problem - hangs on ARM, segfaults on x86_64
ubus call ddns get_services_status # problem - hangs on ARM, segfaults on x86_64

For the problem cases, rpcd comes back with Command failed: Request timed out. On ARM, after repeated attempts, the problem cases may eventually complete without hanging.
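Since the failures are intermittent, a small repeat-and-count helper can put a number on how often a given call fails. This is only a sketch: repeat_call is a hypothetical helper name, and it assumes the ubus CLI exits non-zero whenever it prints Command failed (timeouts included).

```shell
#!/bin/sh
# repeat_call N CMD [ARGS...]
# Hypothetical helper: runs CMD N times, discards its output,
# and prints how many of the runs exited non-zero.
repeat_call() {
    n="$1"; shift
    i=0
    fails=0
    while [ "$i" -lt "$n" ]; do
        if ! "$@" >/dev/null 2>&1; then
            fails=$((fails + 1))
        fi
        i=$((i + 1))
    done
    echo "$fails/$n calls failed"
}

# Example on the router (not executed here):
#   repeat_call 50 ubus call ddns get_ddns_state
```

Comparing the count for get_env against get_ddns_state on each device would show whether the failure rate really differs between the calls and between ARM and x86_64.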

Segfault examples from x86_64

[   36.502180] rpcd[1228]: segfault at 0 ip 0000000000000000 sp 00007ffd2a0291b8 error 14 in rpcd[56470ffc7000+4000]
[   36.545021] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[  281.204779] rpcd[4762]: segfault at 0 ip 0000000000000000 sp 00007fff36752fd8 error 14 in rpcd[5565ddc07000+4000]
[  281.247614] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[  546.931074] rpcd[5453]: segfault at 7f00d7117c80 ip 00007ff6d7177213 sp 00007ffea93640e8 error 4 in libucode.so.20230711[7ff6d7164000+18000]
[  547.004116] Code: 8b 05 01 ed 00 00 48 8d 3d 0a 4e 00 00 48 8b 30 ff 15 b1 ed 00 00 ff 15 0b eb 00 00 5a c3 89 f8 83 e0 03 75 0a 48 85 ff 74 05 <8a> 07 83 e0 0f c3 55 48 89 f5 53 48 89 fb 51 67 e8 df ff ff ff 83
[ 1073.937643] rpcd[6715]: segfault at 7f6acd585068 ip 00007f6acd6a58e8 sp 00007ffe46e82390 error 4 in libjson-c.so.5.2.0[7f6acd69f000+8000]
[ 1073.996056] Code: 85 c0 75 46 41 8b 0c 24 48 83 c3 01 39 d9 49 0f 44 de 83 c5 01 39 e9 7e 1b 49 8b 54 24 18 48 8d 04 9b 4c 8d 2c c5 00 00 00 00 <48> 8b 3c c2 48 83 ff ff 75 be 48 83 c4 08 31 c0 5b 5d 41 5c 41 5d

Seems like resource exhaustion issues. Problems seem to centre around the popen call to date.
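One rough way to test the resource-exhaustion guess is to watch how many file descriptors rpcd holds between calls. The sketch below assumes that a leaked popen handle would show up as a growing entry count under /proc/PID/fd. That is an assumption, not something established in this report, and count_fds is a hypothetical helper name.

```shell
#!/bin/sh
# count_fds PID
# Hypothetical helper: prints the number of open file descriptors
# of the given process, as listed under /proc/PID/fd.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Example on the router (not executed here):
#   count_fds "$(pidof rpcd)"
#   ubus call ddns get_ddns_state
#   count_fds "$(pidof rpcd)"
```

If the number climbs steadily across repeated calls, that would support the exhaustion theory. A flat count would point elsewhere.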

Weird thing? If I start with

ubus call ddns get_services_log '{"service_name": "myddns_ipv4"}'

Then subsequent calls don't seem to lock or hang so soon. But ARM does this if it doesn't hang:

ubus call ddns get_ddns_state
{
    "_version": "2.8.2-43",
    "_enabled": true,
    "_curr_dateformat": null, # this field differs
    "_services_list": "NO_LIST"
}

While x86_64 does this:

ubus call ddns get_ddns_state
{
    "_version": "2.8.2-43",
    "_enabled": true,
    "_curr_dateformat": "2024-10-19 01:00\n",
    "_services_list": "NO_LIST"
}

ARM:

NAME="OpenWrt"
VERSION="23.05.5"
ID="openwrt"
ID_LIKE="lede openwrt"
PRETTY_NAME="OpenWrt 23.05.5"
VERSION_ID="23.05.5"
HOME_URL="https://openwrt.org/"
BUG_URL="https://bugs.openwrt.org/"
SUPPORT_URL="https://forum.openwrt.org/"
BUILD_ID="r24106-10cc5fcd00"
OPENWRT_BOARD="mediatek/filogic"
OPENWRT_ARCH="aarch64_cortex-a53"
OPENWRT_TAINTS=""
OPENWRT_DEVICE_MANUFACTURER="OpenWrt"
OPENWRT_DEVICE_MANUFACTURER_URL="https://openwrt.org/"
OPENWRT_DEVICE_PRODUCT="Generic"
OPENWRT_DEVICE_REVISION="v0"
OPENWRT_RELEASE="OpenWrt 23.05.5 r24106-10cc5fcd00"

x86_64

NAME="OpenWrt"
VERSION="23.05.5"
ID="openwrt"
ID_LIKE="lede openwrt"
PRETTY_NAME="OpenWrt 23.05.5"
VERSION_ID="23.05.5"
HOME_URL="https://openwrt.org/"
BUG_URL="https://bugs.openwrt.org/"
SUPPORT_URL="https://forum.openwrt.org/"
BUILD_ID="r24106-10cc5fcd00"
OPENWRT_BOARD="x86/64"
OPENWRT_ARCH="x86_64"
OPENWRT_TAINTS=""
OPENWRT_DEVICE_MANUFACTURER="OpenWrt"
OPENWRT_DEVICE_MANUFACTURER_URL="https://openwrt.org/"
OPENWRT_DEVICE_PRODUCT="Generic"
OPENWRT_DEVICE_REVISION="v0"
OPENWRT_RELEASE="OpenWrt 23.05.5 r24106-10cc5fcd00"

Addendum: Copying the files to my devices using:

scp -O applications/luci-app-ddns/root/usr/share/rpcd/ucode/ddns.uc root@192.168.1.1:/usr/share/rpcd/ucode

reliably triggered the problem. I've noticed somewhere else that scp -O causes problems with large transfers.

But once I installed openssh-sftp-server the problems abated on x86_64.
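If the transfer method is a suspect, comparing checksums of the local and the copied ddns.uc would show whether the file arrived intact. A minimal sketch, assuming sha256sum is available on both ends. digest_of is a hypothetical helper name.

```shell
#!/bin/sh
# digest_of FILE
# Hypothetical helper: prints only the SHA-256 hex digest of FILE
# (empty output if the file cannot be read).
digest_of() {
    sha256sum < "$1" 2>/dev/null | cut -d' ' -f1
}

# Example (not executed here): compare the local copy with the one on the device.
#   local_sum="$(digest_of applications/luci-app-ddns/root/usr/share/rpcd/ucode/ddns.uc)"
#   remote_sum="$(ssh root@192.168.1.1 sha256sum /usr/share/rpcd/ucode/ddns.uc | cut -d' ' -f1)"
#   [ "$local_sum" = "$remote_sum" ] && echo "identical" || echo "files differ"
```

Matching digests would rule out a corrupted copy and suggest the scp -O correlation is a coincidence or a timing effect.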

systemcrash commented 3 weeks ago

BTW, is this relevant and up to date? https://ucode.mein.io/module-core.html Some things here seem to be missing.

jow- commented 2 weeks ago

Couldn't reproduce this during some cursory testing with a qemu x86/64 vm

systemcrash commented 2 weeks ago

Try running it multiple times. Sometimes things worked out for me. But after a few repeats, it eventually happens.

jow- commented 2 weeks ago

Did a

root@OpenWrt:~# while true; do ubus call ddns get_ddns_state ; done

running fine for ~3 minutes

systemcrash commented 2 weeks ago

Madness. Do you run it on 23.05 or master?