Closed dadamssg closed 2 years ago
When FPM spawns a PHP worker process, PHP invokes the MINIT function of each enabled extension. For ext-mongodb, PHP_MINIT_FUNCTION(mongodb) will be called. If we step through that and ignore the PHP-specific stuff (e.g. registering classes), we see two notable actions:
- bson_mem_set_vtable to instruct libbson and libmongoc to use PHP's persistent memory allocation functions
- mongoc_init to initialize libmongoc and its dependencies
Diving into libmongoc, mongoc_init delegates to _mongoc_do_init with logic to ensure that function is only called once per process. When compiled against OpenSSL, _mongoc_do_init will call _mongoc_openssl_init to initialize OpenSSL. I'll stop there for now, but let's remember that function as a future point of investigation for the various OpenSSL function calls that may conflict with whatever else is happening in your environment.
The debug log you provided is unfortunately not helpful here, as it's only reporting tracing from some of libmongoc's own function calls. Since you're not interacting with the driver in this request, the debug log is only reporting some functions that get called from mongoc_init. In this case, we see some trace logs from functions in mongoc-linux-distro-scanner.c, which is called via mongoc-handshake.c (when _mongoc_do_init invokes _mongoc_handshake_init).
To investigate this further, I'll need a GDB backtrace of the segfault.
In my debugging, i wrote a simple test script that generates a JWT token using the same library.
What does "same library" refer to in this case? I assume the "library" is a PHP package that interacts directly with ext-openssl, but correct me if I'm mistaken.
If you're able to share that "simple test script" that also produces a segfault when run through FPM, that'd be most helpful. Sharing composer.json and composer.lock (if applicable) might also help.
I develop in docker and i'm unable to reproduce the issue in my development container however i can still reproduce it in my actual servers.
What are the differences in software versions between your production environment and Docker container?
Wow, thank you for the quick response.
The "same library" refers to Firebase's JWT package: firebase/php-jwt@v5.2.1
The "simple test script" just used the above package to echo out an encoded token. I ran that script from the command line and it ran fine, so I don't think providing it will be helpful. It was essentially the example provided in the README.
Here are some versions on a dev server where I can reproduce the problem:
[root@server mongodebug]# php --version
PHP 7.3.33 (cli) (built: Nov 16 2021 11:18:28) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.3.33, Copyright (c) 1998-2018 Zend Technologies
with Zend OPcache v7.3.33, Copyright (c) 1999-2018, by Zend Technologies
[root@server mongodebug]# openssl version
OpenSSL 1.0.2k-fips 26 Jan 2017
[root@server mongodebug]# pecl list
Installed packages, channel pecl.php.net:
=========================================
Package Version State
apcu 5.1.21 stable
igbinary 3.2.7 stable
imagick 3.4.4 stable
mcrypt 1.0.4 stable
mongodb 1.12.0 stable
pdo_sqlsrv 5.9.0 stable
redis 4.3.0 stable
sqlsrv 5.9.0 stable
zip 1.20.0 stable
Here are the same versions in my container, where I am unable to reproduce the issue:
[root@217275d42dfc /]# php -v
PHP 7.3.28 (cli) (built: Apr 27 2021 13:57:06) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.3.28, Copyright (c) 1998-2018 Zend Technologies
with Zend OPcache v7.3.28, Copyright (c) 1999-2018, by Zend Technologies
with Xdebug v3.0.4, Copyright (c) 2002-2021, by Derick Rethans
[root@217275d42dfc /]# openssl version
OpenSSL 1.0.2k-fips 26 Jan 2017
[root@217275d42dfc /]# pecl list
Installed packages, channel pecl.php.net:
=========================================
Package Version State
imagick 3.4.4 stable
mcrypt 1.0.4 stable
mongodb 1.9.1 stable
pdo_sqlsrv 5.9.0 stable
redis 4.3.0 stable
sqlsrv 5.9.0 stable
xdebug 3.0.4 stable
zip 1.19.2 stable
I just attempted to produce a core file on the server where I can reproduce the problem, but I don't know what I'm doing:
ulimit -c unlimited
mkdir /srv/php-dump
chown nginx /srv/php-dump
echo "/srv/php-dump/core-%e.%p" > /proc/sys/kernel/core_pattern
cd /etc/php.d
mv mongodb.ini_bak mongodb.ini
service php-fpm restart
# triggered a failure
# checked in /srv/php-dump but it's empty
I was able to generate core files via php-fpm settings:
process.dumpable = yes
rlimit_core = unlimited
I don't think I have things configured correctly to spit out anything useful, though.
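A possible explanation for the empty /srv/php-dump directory: ulimit -c unlimited only changes the limit of the shell it is typed in (and that shell's children), and on a systemd-based system such as EL7 a php-fpm restarted through the service manager most likely does not inherit it, which would be why rlimit_core in the pool configuration made the difference. A minimal sketch for checking the core limit a running process actually has, by reading /proc/<pid>/limits; the helper name core_soft_limit and the use of pgrep are illustrative assumptions, not something from this thread:

```shell
# Print the soft "Max core file size" limit from a /proc/<pid>/limits
# listing supplied on stdin (hypothetical helper, not part of php-fpm).
core_soft_limit() {
    awk '/^Max core file size/ { print $5 }'
}

# Limit of the current shell, i.e. what `ulimit -c` changed.
if [ -r /proc/self/limits ]; then
    core_soft_limit < /proc/self/limits
fi

# Limit of the oldest running php-fpm process (normally the master), if any.
fpm_pid=$(pgrep -o php-fpm 2>/dev/null || true)
if [ -n "$fpm_pid" ] && [ -r "/proc/$fpm_pid/limits" ]; then
    core_soft_limit < "/proc/$fpm_pid/limits"
fi
```

If the two values differ, the limit set in the shell never reached php-fpm.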
Edit: I just did what it told me to (debuginfo-install php-cli-7.3.33-1.el7.remi.x86_64) and reran things to collect new core files and make another attempt at using gdb, but I get similar results.
I imagine this has something to do with it:
To get a backtrace with correct information you must have PHP configured with --enable-debug!
I'm not sure how to proceed though.
You're definitely going to need debug headers for all libraries involved in the backtrace ("?? ()" is evidence that no debug symbols are available). That includes PHP itself, any relevant extensions, and OpenSSL. I'm not familiar with how Remi's packages are compiled, but with respect to ext-mongodb, if you find that no debug symbols are present you can always try installing from PECL directly or fall back to compiling from source (optionally specifying --mongodb-developer-flags to configure). That said, if the segfault is happening in OpenSSL when you're not using the driver directly, it's likely that ext-mongodb isn't even part of the backtrace, so I'd focus on PHP and OpenSSL for now.
PHP's own docs for GDB backtraces may provide some guidance: https://bugs.php.net/bugs-generating-backtrace.php
Another approach may be to try and reproduce this error through the PHP CLI on your production system (or the built-in web server). If that's possible, it should be easier to test and capture a backtrace than having to go through FPM each time.
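Once a core file exists, the backtrace can also be collected non-interactively, which is convenient for pasting into an issue. A minimal sketch, assuming gdb is installed and the binary passed in is the one that produced the core; the function name backtrace_from_core and the example paths are placeholders:

```shell
# Print a backtrace and the list of loaded shared libraries from a core file.
# Usage: backtrace_from_core /path/to/binary /path/to/core
backtrace_from_core() {
    if [ ! -r "$2" ]; then
        echo "core file not readable: $2" >&2
        return 1
    fi
    # --batch makes gdb exit after running the -ex commands, so there is
    # no interactive prompt or pager.
    gdb --batch -ex 'bt' -ex 'info sharedlibrary' "$1" "$2"
}

# Example invocation (paths are placeholders):
#   backtrace_from_core /usr/sbin/php-fpm /srv/php-dump/core-php-fpm.12345
```

The info sharedlibrary output lists every library mapped into the crashed process, which can help when several versions of the same library might be loaded.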
Notice: how to install the debug information for any package (gdb usually displays this command when the information is missing):
On Fedora and EL >= 8
dnf debuginfo-install php-cli php-fpm php-pecl-mongodb
On old EL 7
yum install --enablerepo=remi-php73-debuginfo php-debuginfo
@jmikola for RPM distributions, binaries are never stripped, but the debug data is moved to a separate package to reduce installation size (most users don't need it)
@remicollet thank you!
i was able to do that and now i see this.
[root@server php-dump]# gdb core-php-fpm.14947
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
[New LWP 14947]
Reading symbols from /usr/sbin/php-fpm...Reading symbols from /usr/lib/debug/usr/sbin/php-fpm.debug...done.
done.
Missing separate debuginfo for /usr/lib64/php/modules/imagick.so
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/b6/9e58a2a43b0bbd73c5b01eae8b415f456b30c4
Missing separate debuginfo for /usr/lib64/php/modules/mongodb.so
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/df/05eb874abbd5cab9fa8c1fcab40e8ecd2f1090
Missing separate debuginfo for
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/96/026caecea45facf92d43869a81eb2821adf086
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `php-fpm: pool pool2 '.
Program terminated with signal 11, Segmentation fault.
#0 EVP_MD_CTX_cleanup (ctx=ctx@entry=0x0) at digest.c:418
418 if (ctx->digest && ctx->digest->cleanup
Missing separate debuginfos, use: debuginfo-install ImageMagick-6.9.10.68-6.el7_9.x86_64 audit-libs-2.8.5-4.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 cyrus-sasl-lib-2.1.26-23.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 elfutils-libs-0.176-5.el7.x86_64 expat-2.1.0-12.el7.x86_64 fontconfig-2.13.0-4.3.el7.x86_64 freetype-2.8-14.el7_9.1.x86_64 fribidi-1.0.2-1.el7_7.1.x86_64 gd-last-2.3.3-2.el7.remi.x86_64 glib2-2.56.1-9.el7_9.x86_64 graphite2-1.3.10-1.el7_3.x86_64 harfbuzz-1.7.5-2.el7.x86_64 jbigkit-libs-2.0-11.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 lcms2-2.6-3.el7.x86_64 libICE-1.0.9-9.el7.x86_64 libSM-1.2.2-2.el7.x86_64 libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 libXext-1.3.3-3.el7.x86_64 libXpm-3.5.12-1.el7.x86_64 libXt-1.1.5-3.el7.x86_64 libacl-2.2.51-15.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libc-client-2007f-16.el7.x86_64 libcap-2.22-11.el7.x86_64 libcap-ng-0.7.5-4.el7.x86_64 libcurl-7.29.0-59.el7_9.1.x86_64 libgcrypt-1.5.3-14.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libidn-1.28-4.el7.x86_64 libjpeg-turbo-1.2.90-8.el7.x86_64 liblzf-3.6-7.el7.x86_64 libmcrypt-2.5.8-13.el7.x86_64 libpng-1.5.13-8.el7.x86_64 libraqm-0.7.0-4.el7.x86_64 libselinux-2.5-15.el7.x86_64 libssh2-1.8.0-4.el7.x86_64 libtiff-4.0.3-35.el7.x86_64 libtool-ltdl-2.4.2-22.el7_3.x86_64 libuuid-2.23.2-65.el7_9.1.x86_64 libwebp7-1.0.3-1.el7.remi.x86_64 libxcb-1.13-1.el7.x86_64 libxslt-1.1.28-6.el7.x86_64 libzip5-1.8.0-2.el7.remi.x86_64 libzstd-1.5.2-1.el7.x86_64 lz4-1.8.3-1.el7.x86_64 msodbcsql17-17.8.1.2-1.x86_64 nspr-4.32.0-1.el7_9.x86_64 nss-3.67.0-4.el7_9.x86_64 nss-softokn-freebl-3.67.0-3.el7_9.x86_64 nss-util-3.67.0-1.el7_9.x86_64 oniguruma5php-6.9.7.1-1.el7.remi.x86_64 openldap-2.4.44-24.el7_9.x86_64 openssl11-libs-1.1.1k-2.el7.x86_64 pam-1.1.8-23.el7.x86_64 pcre-8.32-17.el7.x86_64 php-pecl-apcu-5.1.21-1.el7.remi.7.3.x86_64 php-pecl-igbinary-3.2.7-1.el7.remi.7.3.x86_64 php-pecl-mcrypt-1.0.4-1.el7.remi.7.3.x86_64 php-pecl-redis4-4.3.0-2.el7.remi.7.3.x86_64 
php-pecl-zip-1.20.0-1.el7.remi.7.3.x86_64 php-sqlsrv-5.9.0-1.el7.remi.7.3.x86_64 sqlite-3.7.17-8.el7_7.1.x86_64 sssd-client-1.16.5-10.el7_9.11.x86_64 systemd-libs-219-78.el7_9.5.x86_64 tideways-xhprof-5.0.2-1.x86_64 unixODBC-2.3.7-1.rh.x86_64 xz-libs-5.2.2-1.el7.x86_64
(gdb) bt
#0 EVP_MD_CTX_cleanup (ctx=ctx@entry=0x0) at digest.c:418
#1 0x00007f744cd995e9 in EVP_MD_CTX_free (ctx=0x0) at /var/tmp/mongodb/src/libmongoc/src/libmongoc/src/mongoc/mongoc-crypto-openssl.c:51
#2 0x00007f744158ab35 in ssl3_free_digest_list () from /lib64/libssl.so.1.1
#3 0x00007f744158b793 in ssl3_clear () from /lib64/libssl.so.1.1
#4 0x00007f74415c2a29 in tls1_clear () from /lib64/libssl.so.1.1
#5 0x00007f744158b4dc in ssl3_new () from /lib64/libssl.so.1.1
#6 0x00007f74415c29c9 in tls1_new () from /lib64/libssl.so.1.1
#7 0x00007f744159bb81 in SSL_new () from /lib64/libssl.so.1.1
#8 0x00007f7441954d46 in ?? () from /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.8.so.1.2
#9 0x00007f744194f7a2 in ?? () from /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.8.so.1.2
#10 0x00007f744195006c in ?? () from /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.8.so.1.2
#11 0x00007f7441919e8f in ?? () from /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.8.so.1.2
#12 0x00007f7441917a31 in ?? () from /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.8.so.1.2
#13 0x00007f7441918594 in ?? () from /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.8.so.1.2
#14 0x00007f7441886ee7 in ?? () from /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.8.so.1.2
#15 0x00007f74418bb60e in ?? () from /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.8.so.1.2
#16 0x00007f74418862ba in SQLDriverConnectW () from /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.8.so.1.2
#17 0x00007f744d9baf9e in SQLDriverConnectW () from /lib64/libodbc.so.2
#18 0x00007f744df2c3db in core_odbc_connect(sqlsrv_conn*, std::string&, bool) () from /usr/lib64/php/modules/sqlsrv.so
#19 0x00007f744df2ddc6 in core_sqlsrv_connect(sqlsrv_context&, sqlsrv_context&, sqlsrv_conn* (*)(void*, bool (*)(sqlsrv_context&, unsigned int, int, __va_list_tag (*) [1]), void*), char const*, char const*, char const*, _zend_array*, bool (*)(sqlsrv_context&, unsigned int, int, __va_list_tag (*) [1]), connection_option const*, void*, char const*) () from /usr/lib64/php/modules/sqlsrv.so
#20 0x00007f744df2098a in zif_sqlsrv_connect () from /usr/lib64/php/modules/sqlsrv.so
---Type <return> to continue, or q <return> to quit---
@jmikola looks like my error is coming from /var/tmp/mongodb/src/libmongoc/src/libmongoc/src/mongoc/mongoc-crypto-openssl.c:51
#2 0x00007f744158ab35 in ssl3_free_digest_list () from /lib64/libssl.so.1.1
Looks like you have openssl11-libs installed (and used by libmsodbcsql). This is terribly wrong: mixing OpenSSL 1.0 and 1.1 in the same process cannot work, and will raise this kind of issue.
Try to remove it.
From a quick look, I only see nginx (from EPEL) pulling it in.
Nothing from the PHP stack should use it. How were the PHP extensions installed?
(Using my repo, "yum install php-redis php-sqlsrv php-mongodb ...", only OpenSSL 1.0 is used.)
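The mix is visible in the backtrace itself: listing the distinct shared objects that the frames come from shows /lib64/libssl.so.1.1 in a process whose PHP stack is expected to use OpenSSL 1.0. A rough sketch that pulls those paths out of saved gdb output; the helper name backtrace_libraries and the file name bt.txt are made up for illustration:

```shell
# List the distinct shared objects named in the "from <path>" suffix of
# each frame of a gdb backtrace read from stdin.
backtrace_libraries() {
    sed -n 's|^#[0-9][0-9]* .* from \(/[^ ]*\)$|\1|p' | LC_ALL=C sort -u
}

# Example invocation, assuming the `bt` output was saved to bt.txt:
#   backtrace_libraries < bt.txt
```

Frames that gdb resolves to a source file ("at file.c:NN") carry no "from" suffix and are skipped, so this only approximates the set of libraries involved.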
I believe pecl was used to install some of the extensions. There was no one installing PHP extensions when this issue arose, though, which is why I'm perplexed. Is there a way to check when a package (openssl11) was installed on my machine? Is it in the realm of possibility that anti-virus software or something else would take it upon itself to install that?
side note: I'm at the hospital with my wife who is having a baby today so I won't be able to do much with this today otherwise I would be quick to respond since I really appreciate you both helping me
@jmikola I notice libmongoc / libmongocrypt have some internal functions from OpenSSL 1.1 when built with OpenSSL 1.0.
From an objdump:
000000000008a210 g DF .text 0000000000000012 Base EVP_MD_CTX_free
00000000000e05b0 g DF .text 0000000000000012 Base EVP_CIPHER_CTX_free
000000000008a200 g DF .text 000000000000000a Base EVP_MD_CTX_new
00000000000e05a0 g DF .text 000000000000000a Base EVP_CIPHER_CTX_new
=> https://github.com/mongodb/mongo-c-driver/blob/master/src/libmongoc/src/mongoc/mongoc-crypto-openssl.c#L41
=> https://github.com/mongodb/libmongocrypt/blob/master/src/crypto/libcrypto.c#L36
It would be nice to make such internal functions private and hide them (e.g. in the above backtrace, you can see that OpenSSL 1.1 calls the function from libmongoc instead of its own implementation, which seems terrible).
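A listing like the objdump excerpt above can be reproduced with objdump -T, which prints a shared object's dynamic symbol table. A small sketch that keeps only the globally exported EVP_* functions defined in the object itself; the helper name exported_evp_symbols is hypothetical, and the module path is the one that appears in the gdb output earlier in this thread:

```shell
# From `objdump -T` output on stdin, print the names of global symbols
# defined in .text whose names start with EVP_. Undefined (imported)
# symbols have no "g" flag in the second column and are skipped.
exported_evp_symbols() {
    awk '$2 == "g" && $4 == ".text" && $NF ~ /^EVP_/ { print $NF }' | LC_ALL=C sort
}

module=/usr/lib64/php/modules/mongodb.so   # adjust to the local module path
if command -v objdump >/dev/null 2>&1 && [ -r "$module" ]; then
    objdump -T "$module" | exported_evp_symbols
fi
```

Any name printed here is a symbol that other libraries in the same process could end up resolving to, instead of the copy in their own OpenSSL.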
I believe pecl was used to install some of the extensions.
If you use my repo for PHP, also use it for extensions... most of them are available.
I think openssl11-devel was mistakenly installed at some point, so the build from source (pecl command) of one extension used it.
is there a way to check when a package(openssl11) was installed on my machine?
rpm -qi packagename => "Install Date", also "yum history"
And "yum remove" will tell you if something really needs it
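As a concrete version of that suggestion, the install date can be pulled straight out of the rpm -qi output. A minimal sketch, assuming the package name openssl11-libs from the gdb output above; the helper name install_date is made up for illustration:

```shell
# Extract the "Install Date" value from `rpm -qi` output on stdin.
install_date() {
    sed -n 's/^Install Date *: *//p'
}

pkg=openssl11-libs
# Only query when rpm is available and the package is actually installed.
if command -v rpm >/dev/null 2>&1 && rpm -q "$pkg" >/dev/null 2>&1; then
    rpm -qi "$pkg" | install_date
fi
```

On EL7, running yum history list openssl11-libs as root should additionally show the transactions that touched the package, which helps identify what pulled it in.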
side note: I'm at the hospital with my wife who is having a baby today so I won't be able to do much with this today otherwise I would be quick to respond since I really appreciate you both helping me
Congratulations! Hope everything goes smoothly and I appreciate your patience while we sort this out.
And huge thanks to @remicollet for magically appearing in this thread :)
I notice libmongoc / libmongocrypt have some internal functions from openssl 1.1 when built with openssl 1.0
I agree this is problematic. The intention here was obviously just to polyfill the OpenSSL 1.1 functions but the fact that these are global symbols invites them to be called from places outside of libmongoc, which is certainly a problem.
000000000008a210 g DF .text 0000000000000012 Base EVP_MD_CTX_free
00000000000e05b0 g DF .text 0000000000000012 Base EVP_CIPHER_CTX_free
000000000008a200 g DF .text 000000000000000a Base EVP_MD_CTX_new
00000000000e05a0 g DF .text 000000000000000a Base EVP_CIPHER_CTX_new
It looks like these functions originate from both libmongoc and libmongocrypt. I'll file JIRA tickets in both projects to propose we address this (either static declarations or prefixed names, as we do for some libbson polyfills).
sorry for my ignorance...as far as actions for me to attempt when I am able...is attempting to uninstall openssl11 something that makes sense?
yum remove openssl11-devel
^ based on remi's reply earlier
sorry for my ignorance...as far as actions for me to attempt when I am able...is attempting to uninstall openssl11 something that makes sense?
yum remove openssl11-devel
^ based on remi's reply earlier
To remove the library: yum remove openssl11-libs
To remove the devel package: yum remove openssl11-devel
Removing the headers will avoid using it, but you will have to reinstall everything built with it.
Removing the library will probably break things... btw things are already broken ;)
see https://github.com/mongodb/libmongocrypt/pull/251 and https://github.com/mongodb/mongo-c-driver/pull/946
@remicollet: Thanks for the PRs. I just reported CDRIVER-4297 and MONGOCRYPT-383 and will cross-reference those now.
I left some comments on your PRs, as there are a few extra cases of polyfill declarations that were missed (which I found while reporting those issues).
lol well I've since disabled the extension so things aren't completely broken. We only use Mongo to store xhprof profiles, so that's the only thing not working.
I'm scared removing openssl11 will cause my semi-functioning setup to break. It seems like libmsodbcsql is also using it? We use MSSQL, and it seems like removing that could potentially affect the SQL Server driver (ODBC)? idk. This stuff is so much lower level than what I have an understanding of.
I don't know what to try to resolve this and be able to use the extension again.
@dadamssg sorry, but your setup seems to be a mess... and indeed cleaning it will be risky. It is perhaps safer to reinstall a clean environment from scratch.
And again, building from sources on a prod server is always a very bad idea... especially when RPM packages exist. Q.E.D.
libmsodbcsql is not using OpenSSL (directly).
You can run something like ldd /usr/lib64/php/modules/*.so to see what is using libcrypto.so.1.1 or libssl.so.1.1
So, you have to properly reinstall each extension that is using OpenSSL 1.1.
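The ldd check can be turned into a per-module report so that the offending extensions stand out. A minimal sketch, assuming the module directory from the command above; the function names are hypothetical:

```shell
# Print the distinct OpenSSL sonames found in `ldd` output read from stdin.
openssl_sonames() {
    grep -oE 'lib(ssl|crypto)\.so\.[0-9][0-9A-Za-z.]*' | LC_ALL=C sort -u
}

# For every shared object given as an argument, report which OpenSSL
# libraries the dynamic linker would load for it.
report_openssl_links() {
    for rol_so in "$@"; do
        [ -e "$rol_so" ] || continue
        rol_names=$(ldd "$rol_so" 2>/dev/null | openssl_sonames | tr '\n' ' ')
        if [ -n "$rol_names" ]; then
            printf '%s: %s\n' "$rol_so" "$rol_names"
        fi
    done
    return 0
}

report_openssl_links /usr/lib64/php/modules/*.so
```

Modules reporting a 1.1 soname are the candidates for reinstalling. Note that ldd only follows link-time dependencies, so a library loaded at runtime through dlopen will not show up here.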
@dadamssg: I've just released 1.13.0, which includes the necessary bug fixes in libmongoc and libmongocrypt. Apologies for the delay, but we ran into some unexpected issues preparing the libmongocrypt 1.3.2 release.
Upgrading ext-mongodb should address everything mentioned above, but please let me know if that's not the case and we can re-open and investigate further.
@jmikola awesome. thank you!
@jmikola this release works for us and our situation. thank you so much. really appreciate you digging into this with me.
Bug Report
I have 3 servers running PHP with the mongo extensions installed. On Feb 13, all of them encountered the same error at the same time: a segfault coming from libcrypto.so.1.0.2k.
The odd part (aside from all the servers experiencing the same thing at the same time) is that the code I'm running is not even using the mongo extension. We use JWT tokens for auth. This crash seems to come up only on API requests that interact with a crypto algorithm, to either encode or decode a JWT token. I disabled the mongo extension, restarted php-fpm, and things started working normally again. Thankfully I'm just using mongo to store xhprof profiles on demand, but I have no idea what changed on all 3 of my servers at the same time to cause this.
In my debugging, I wrote a simple test script that generates a JWT token using the same library. I was able to run this script fine from the command line with the mongo extension enabled. It appears a magic combination of the following causes the error:
I was able to make requests against my public API endpoints (that don't require any crypto) with the mongo extension enabled and those responded fine.
I develop in Docker and I'm unable to reproduce the issue in my development container; however, I can still reproduce it on my actual servers.
Environment
Expected and Actual Behavior
Debug Log