Open honggyukim opened 1 year ago
The backtrace from gdb is as follows.
$ gdb -q --args uftrace record async_exit.py
Reading symbols from uftrace...
(gdb) r
...
munmap_chunk(): invalid pointer
Thread 2.1 "python3" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff7ccf840 (LWP 24517)]
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737350793280) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140737350793280) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140737350793280, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ffff7a42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ffff7a287f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ffff7a896f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7bdbb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#6 0x00007ffff7aa0d7c in malloc_printerr (str=str@entry=0x7ffff7bde230 "munmap_chunk(): invalid pointer") at ./malloc/malloc.c:5664
#7 0x00007ffff7aa105c in munmap_chunk (p=<optimized out>) at ./malloc/malloc.c:3060
#8 0x00007ffff7aa551a in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3381
#9 0x00007ffff65aca00 in OPENSSL_LH_free () from /lib/x86_64-linux-gnu/libcrypto.so.3
#10 0x00007ffff65bfa5f in OPENSSL_cleanup () from /lib/x86_64-linux-gnu/libcrypto.so.3
#11 0x00007ffff7a45495 in __run_exit_handlers (status=0, listp=0x7ffff7c19838 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:113
#12 0x00007ffff7a45610 in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:143
#13 0x00007ffff7a29d97 in __libc_start_call_main (main=main@entry=0x55555577eb90, argc=argc@entry=4, argv=argv@entry=0x7fffffffdac8) at ../sysdeps/nptl/libc_start_call_main.h:74
#14 0x00007ffff7a29e40 in __libc_start_main_impl (main=0x55555577eb90, argc=4, argv=0x7fffffffdac8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdab8) at ../csu/libc-start.c:392
#15 0x000055555577eac5 in _start ()
I see that this problem disappears if 953cef1fd197c37c5d2d103361fe94cb9cea857d, which is from #1725, is reverted.
Hmm.. it might be related some unknown issue. The 953cef1 might not be a real problem. The problem disappears when changing the uftrace execution directory is changed.
Can anyone reproduce this problem? Otherwise, it might just be a tricky environment issue in my machine.
This issue occurred in a Python venv environment with the problem https://github.com/namhyung/uftrace/pull/1725#issuecomment-1577978923.
(venv4) $ uftrace async_exit.py
munmap_chunk(): invalid pointer
WARN: child terminated by signal: 6: Aborted
# DURATION TID FUNCTION
[ 9733] | <1>() {
112.302 ms [ 9733] | <2>();
0.625 us [ 9733] | <3540>();
112.315 ms [ 9733] | } /* <1> */
It could be a Python package dependency issue. Here is the list of Python packages for the venv.
#requirements.txt
altgraph==0.17.3
contourpy==1.1.0
cycler==0.11.0
fonttools==4.40.0
kiwisolver==1.4.4
matplotlib==3.7.1
numpy==1.24.3
packaging==23.1
pandas==2.0.2
Pillow==9.5.0
pyinstaller==5.13.0
pyinstaller-hooks-contrib==2023.5
pyparsing==3.0.9
python-dateutil==2.8.2
pytz==2023.3
six==1.16.0
tzdata==2023.3
I can reproduce the problem with the commands below.
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install -r requirements.txt
$ uftrace async_exit.py
Other Python environments do not have this issue.
(venv3) $ uftrace async_exit.py
# DURATION TID FUNCTION
[ 7157] | __main__.<module>() {
116.136 ms [ 7157] | importlib._bootstrap._find_and_load();
0.513 us [ 7157] | sys.exit();
116.153 ms [ 7157] | } /* __main__.<module> */
@MichelleJin12 Thanks a lot for the analysis. I found that installing matplotlib
makes the problem. The following requirements.txt
installation doesn't show the problem.
altgraph==0.17.3
contourpy==1.1.0
cycler==0.11.0
fonttools==4.40.0
kiwisolver==1.4.4
#matplotlib==3.7.1
numpy==1.24.3
packaging==23.1
pandas==2.0.2
Pillow==9.5.0
pyinstaller==5.13.0
pyinstaller-hooks-contrib==2023.5
pyparsing==3.0.9
python-dateutil==2.8.2
pytz==2023.3
six==1.16.0
tzdata==2023.3
The following requirements.txt
file makes the problem. We need to investigate why it makes the problem.
matplotlib==3.7.1
My test environment looks like this. (An ARM version of Ubuntu running as a virtual machine on a MacBook M2)
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy
$ uname -a
Linux ubuntu-test 5.15.0-76-generic #83-Ubuntu SMP Thu Jun 15 19:21:56 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
Here's a list of the Python venv packages in my environment.
$ pip3 list
Package Version
--------------- -------
contourpy 1.1.0
cycler 0.11.0
fonttools 4.41.1
kiwisolver 1.4.4
matplotlib 3.7.1
numpy 1.24.3
packaging 23.1
Pillow 9.5.0
pip 22.0.2
pyparsing 3.0.9
python-dateutil 2.8.2
setuptools 59.6.0
six 1.16.0
In Ubuntu 22.04 Arm version, even if matplotlib
is installed, it worked fine as below.
$ uftrace asyncio_exit.py
# DURATION TID FUNCTION
[ 21468] | __main__.<module>() {
61.808 ms [ 21468] | importlib._bootstrap._find_and_load();
0.292 us [ 21468] | sys.exit();
61.816 ms [ 21468] | } /* __main__.<module> */
However I was able to reproduce the same problem as above when matplotlib
was installed on Ubuntu in an x86 environment.
I approached it thinking that the problem was caused by a different version of OpenSSL being called by Python, but I found that OpenSSL 3.0.2 15 March 2022
was the same.
Can anyone reproduce the same results as me on aarch64
arch?
$lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy
$ uname -a
Linux rusalka 5.15.0-76-generic #83-Ubuntu SMP Thu Jun 15 19:21:56 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
$ pip3 list
Package Version
------------------------- -------
altgraph 0.17.3
contourpy 1.1.0
cycler 0.11.0
fonttools 4.40.0
kiwisolver 1.4.4
numpy 1.24.3
packaging 23.1
pandas 2.0.2
Pillow 9.5.0
pip 22.0.2
pyinstaller 5.13.0
pyinstaller-hooks-contrib 2023.5
pyparsing 3.0.9
python-dateutil 2.8.2
pytz 2023.3
setuptools 59.6.0
six 1.16.0
tzdata 2023.3
$ uftrace /home/winpsho/uftrace/tests/async_exit.py
# DURATION TID FUNCTION
[296717] | __main__.<module>() {
85.880 ms [296717] | importlib._bootstrap._find_and_load();
0.625 us [296717] | sys.exit();
85.892 ms [296717] | } /* __main__.<module> */
I got same result. It works without problems.
Now, I see that this problem is fixed by a9e878277b4025a7107fdee8885aa30142923730.
$ uftrace ./async_exit.py
WARN: unpaired cygprof exit: dropping...
# DURATION TID FUNCTION
[ 41753] | __main__.<module>() {
102.347 ms [ 41753] | importlib._bootstrap._find_and_load();
0.794 us [ 41753] | sys.exit();
102.361 ms [ 41753] | } /* __main__.<module> */
45.775 us [ 41753] | threading._shutdown();
50.603 us [ 41753] | logging.shutdown();
3.614 us [ 41753] | _weakrefset._remove();
I'm not sure if we really have to print WARN: unpaired cygprof exit: dropping...
message though.
@namhyung Do you think this is enough fix for this issue?
Hmm.. I thought it'd be rare. If not, we might change it to pr_dbg()
.
I see it quite often when tracing python these days. I think better to change it to pr_dbg.
The example program is as follows.
It gets aborted when tracing with uftrace as follows.
But there is no problem calling
sys.exit
without importingasyncio
.