docker-library / postgres

Docker Official Image packaging for Postgres
http://www.postgresql.org
MIT License

Reproducible SIGSEGV in query #1233

Open joachimhb opened 6 months ago

joachimhb commented 6 months ago

Postgres 16.2 on Debian bookworm, ARM64 (AWS t4g.xlarge, Graviton2, 64-bit), using the latest 16.2 Docker image from https://hub.docker.com/_/postgres

A specific query triggers a SIGSEGV and leaves two core files in /var/lib/postgresql/data:

-rw------- 1 postgres postgres 6731063296 Apr 30 13:09 core.32
-rw------- 1 postgres postgres 6713049088 Apr 30 13:09 core.33 

I installed the debug symbols:

find-dbgsym-packages /usr/lib/postgresql/16/bin/postgres
sudo apt install postgresql-16-dbgsym

Then I tried to get a stack trace:

gdb -q /usr/lib/postgresql/16/bin/postgres ./core.33

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: parallel worker for PID 32                                          '.
Program terminated with signal SIGSEGV, Segmentation fault.
--Type <RET> for more, q to quit, c to continue without paging--
#0  0x0000fffdf6f1d708 in ?? ()
(gdb) bt full
#0  0x0000fffdf6f1d708 in ?? ()
No symbol table info available.
#1  0x0000fffdf96a6338 in __gthread_mutex_lock () at /usr/lib/gcc/aarch64-linux-gnu/12/../../../../include/aarch64-linux-gnu/c++/12/bits/gthr-default.h:749
No locals.
#2  lock () at /usr/lib/gcc/aarch64-linux-gnu/12/../../../../include/c++/12/bits/std_mutex.h:100
No locals.
#3  lock_guard () at /usr/lib/gcc/aarch64-linux-gnu/12/../../../../include/c++/12/bits/std_mutex.h:229
No locals.
#4  remove_fatal_error_handler () at build-llvm/tools/clang/stage2-bins/llvm/lib/Support/ErrorHandling.cpp:76
No locals.
#5  0x0000fffdf7114918 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame inner to this frame (corrupt stack?) 

and

gdb -q /usr/lib/postgresql/16/bin/postgres ./core.32

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: sa bi 10.14.100.230(55185) SELECT                                   '.
Program terminated with signal SIGSEGV, Segmentation fault.
--Type <RET> for more, q to quit, c to continue without paging--
#0  0x0000fffdf59aa708 in ?? ()
(gdb) bt full
#0  0x0000fffdf59aa708 in ?? ()
No symbol table info available.
#1  0x0000fffdf8736338 in __gthread_mutex_lock () at /usr/lib/gcc/aarch64-linux-gnu/12/../../../../include/aarch64-linux-gnu/c++/12/bits/gthr-default.h:749
No locals.
#2  lock () at /usr/lib/gcc/aarch64-linux-gnu/12/../../../../include/c++/12/bits/std_mutex.h:100
No locals.
#3  lock_guard () at /usr/lib/gcc/aarch64-linux-gnu/12/../../../../include/c++/12/bits/std_mutex.h:229
No locals.
#4  remove_fatal_error_handler () at build-llvm/tools/clang/stage2-bins/llvm/lib/Support/ErrorHandling.cpp:76
No locals.
#5  0x0000fffdfe8e8e28 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame inner to this frame (corrupt stack?) 

Please let me know if there is anything useful in this data to analyze, or if you want me to run other steps to collect more. The issue can be reproduced on demand.
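
For reference, the two gdb sessions above could also be captured non-interactively, which avoids the pager prompt and records every thread rather than only the current one. This is only a sketch, not something run as part of this report: the function name dump_backtrace and the output file suffix .bt.txt are made up for illustration, and it assumes gdb and the debug symbols are installed as described above.

```shell
#!/bin/sh
# Sketch: write a full backtrace of every thread in a core file to a text file.
# PG_BIN is the binary path used in the gdb commands above.
PG_BIN=/usr/lib/postgresql/16/bin/postgres

# dump_backtrace is a hypothetical helper name; "$core.bt.txt" is an arbitrary
# output file name chosen for this example.
dump_backtrace() {
    core="$1"
    # -batch runs the -ex commands and exits; batch mode already disables
    # pagination, the explicit setting is only there for clarity.
    gdb -q -batch \
        -ex "set pagination off" \
        -ex "thread apply all bt full" \
        "$PG_BIN" "$core" > "$core.bt.txt" 2>&1
}

# Only process core files that actually exist in the current directory.
for c in core.32 core.33; do
    if [ -f "$c" ]; then
        dump_backtrace "$c"
    fi
done
```

Run from /var/lib/postgresql/data, this would leave core.32.bt.txt and core.33.bt.txt next to the core files, which are easier to attach to a report than a pasted interactive session.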

tianon commented 6 months ago

Our packages here are stock installed packages from upstream, so I think https://wiki.postgresql.org/wiki/Apt is probably the best place to chase information about what's happening here. :sweat_smile:

From that page, my best guess would be one of: