Closed hexiaowen closed 3 years ago
There is something strange here.
In frame 11, nulstr=0xffff877f93d0 "" (an empty string),
but in frame 12 the receive buffer, which spans that address, holds a complete uevent:
(gdb) x/32bs (uint8_t*) &buf.raw[bufpos]
0xffff877f9360: "ACTION"
0xffff877f9367: "change"
0xffff877f936e: "DEVPATH"
0xffff877f9376: "/devices/virtual/block/dm-69"
0xffff877f9393: "SUBSYSTEM"
0xffff877f939d: "block"
0xffff877f93a3: "DM_COOKIE"
0xffff877f93ad: "23068672"
0xffff877f93b6: "DEVNAME"
0xffff877f93be: "/dev/dm-69"
0xffff877f93c9: "DEVTYPE"
0xffff877f93d1: "disk"
0xffff877f93d6: "SEQNUM"
0xffff877f93dd: "14437"
0xffff877f93e3: "USEC_INITIALIZED"
0xffff877f93f4: "8213096220"
0xffff877f93ff: "MAJOR"
0xffff877f9405: "253"
0xffff877f9409: "MINOR"
0xffff877f940f: "69"
0xffff877f9412: "DM_UDEV_DISABLE_LIBRARY_FALLBACK_FLAG"
0xffff877f9438: "1"
0xffff877f943a: "DM_UDEV_PRIMARY_SOURCE_FLAG"
0xffff877f9456: "1"
0xffff877f9458: "DM_SUBSYSTEM_UDEV_FLAG0"
0xffff877f9470: "1"
0xffff877f9472: "DM_ACTIVATION"
0xffff877f9480: "0"
0xffff877f9482: "DM_NAME"
0xffff877f948a: "36e02861100592fcc99ad3c3800000195"
0xffff877f94ac: "DM_UUID"
0xffff877f94b4: "mpath-36e02861100592fcc99ad3c3800000195"
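As a small worked check of the addresses above (a sketch that only does arithmetic on values copied from the gdb output; the helper names are made up for illustration), the nulstr pointer from frame 11 lies 7 bytes past the start of "DEVTYPE", which is exactly the position of that string's terminating NUL. That would explain why gdb prints it as an empty string. Why the pointer ends up there is not established by this check.

```c
#include <stdint.h>
#include <string.h>

/* Addresses copied verbatim from the gdb output in this report. */
static const uint64_t devtype_addr = 0xffff877f93c9ULL; /* start of "DEVTYPE" */
static const uint64_t nulstr_addr  = 0xffff877f93d0ULL; /* nulstr in frame 11 */

/* Hypothetical helper: offset of nulstr from the start of "DEVTYPE". */
static size_t nulstr_offset(void)
{
    return (size_t)(nulstr_addr - devtype_addr);
}

/* Hypothetical helper: returns 1 if nulstr lands exactly on the
 * terminating NUL byte of "DEVTYPE", 0 otherwise. */
static int nulstr_is_devtype_terminator(void)
{
    return nulstr_offset() == strlen("DEVTYPE");
}
```

The next string in the dump, "disk", starts one byte later at 0xffff877f93d1, which agrees with this offset.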
As noted on dm-devel, could you check if it helps to disable pthread_cancel() while calling udev_monitor_receive_device()?
I don't think libudev is generally safe to be used in multithreaded programs. We're not aware of any issues, but this might be one.
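For illustration, the experiment suggested here could look roughly like the sketch below. It is not the patch that was later applied to multipath-tools. The helper call_with_cancel_disabled() and the stand-in probe_cancel_state() are hypothetical names; only pthread_setcancelstate() is a real POSIX call.

```c
#include <pthread.h>

/* Hypothetical signature for "some blocking call we want to protect". */
typedef void *(*blocking_call_fn)(void *arg);

/* Run fn(arg) with cancellation disabled for the calling thread, then
 * restore whatever cancel state was in effect before the call. */
static void *call_with_cancel_disabled(blocking_call_fn fn, void *arg)
{
    int old_state, ignored;
    void *ret;

    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    ret = fn(arg);
    pthread_setcancelstate(old_state, &ignored);
    return ret;
}

/* Stand-in for the real receive call, used only to observe the cancel
 * state that is active inside the protected region. It stores that
 * state in the int that arg points to and returns arg unchanged. */
static void *probe_cancel_state(void *arg)
{
    int *seen = arg;
    int ignored;

    /* Read the current state by setting DISABLE, then put it back. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, seen);
    pthread_setcancelstate(*seen, &ignored);
    return arg;
}
```

In multipathd the protected call would be udev_monitor_receive_device(), reached through a thin adapter matching blocking_call_fn (an assumption about how one might wire it up). A pending cancellation request is then acted on only after the previous state is restored and the thread reaches its next cancellation point.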
Disabling pthread_cancel() while calling udev_monitor_receive_device() does help. Please provide a patch. Thanks.
This is a major change in multipath-tools, and can't be rushed. I've been sick lately and have not been able to work on it. Please check whether you can fix the issue in OpenEuler by simply not building libudev and libsystemd with -fexceptions.
This is fixed by not using -fexceptions. Thanks!
FTR, there was another issue, fixed with https://github.com/openSUSE/multipath-tools/commit/38ffd890aaeace8a6909f5685d3394e8cfe3b975 from https://github.com/openSUSE/multipath-tools/tree/queue.
I believe this issue can be closed.
@cvaroqui, would you mind closing this issue?
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x0000ffff87d9e81c in __GI_abort () at abort.c:79
#2 0x0000ffff87dd7818 in __libc_message (action=action@entry=do_abort,
#3 0x0000ffff87dddf6c in malloc_printerr (
#4 0x0000ffff87ddf780 in _int_free (av=0xffff87ed7a58, p=0xffff80000070,
#5 0x0000ffff880f55a8 in internal_hashmap_clear (h=h@entry=0xffff80027980,
#6 0x0000ffff880f56a0 in internal_hashmap_free (h=,
#7 0x0000ffff880f582c in ordered_hashmap_free_free_free () at ../src/basic/hashmap.h:118
#8 device_free (device=0xffff80027820) at ../src/libsystemd/sd-device/sd-device.c:68
#9 sd_device_unref (p=) at ../src/libsystemd/sd-device/sd-device.c:78
#10 0x0000ffff88100978 in sd_device_unrefp () at ../src/systemd/sd-device.h:118
#11 device_new_from_nulstr (len=, nulstr=0xffff877f93d0 "",
#12 device_monitor_receive_device (m=0xffff80000b20, ret=ret@entry=0xffff877fb388)
#13 0x0000ffff881028a4 in udev_monitor_receive_sd_device (ret=0xffff877fb388,
#14 udev_monitor_receive_device (udev_monitor=0xffff80000c70,
#15 0x0000ffff881a3478 in uevent_listen (udev=0xffff877fbf40) at uevent.c:853
#16 0x0000aaaadc524514 in ueventloop (ap=0xffffc4134bd0) at main.c:1518
#17 0x0000ffff880827ac in start_thread (arg=0xffff8821e380) at pthread_create.c:486
#18 0x0000ffff87e3c47c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
multipathd has crashed twice with almost the same call stack. I suspected the udev API at first. However, hashmap is a common data structure in systemd, and systemd itself has never produced this call stack. Can someone help me?
The test case is: run the kill -9 multipathd command repeatedly, then restart the system and check whether multipathd works normally.