python / cpython

The Python programming language
https://www.python.org
Other
62.68k stars 30.06k forks source link

Signal if there is an OpenSSL major version mismatch #99951

Open mmomtchev opened 1 year ago

mmomtchev commented 1 year ago

Feature or enhancement

Implement some kind of obvious warning or even fatal error when the runtime OpenSSL major version does not match the one Python was compiled against.

Pitch

This is especially valid when using the embedded interpreter. I am developing a Node.js/Python compatiblity layer (https://github.com/mmomtchev/pymport) and Node.js exports the OpenSSL symbols it was built with. Python on Ubuntu 22.04 was compiled with libssl3 and it can't be used in any software compiled with libssl1. There is no magic solution to this problem, but printing a warning or even quitting with a fatal error would avoid wasting many hours debugging very hard to find crashes.

As we are currently right in the middle of the painful switch from OpenSSL 1 to OpenSSL 3, I am sure many will be thankful for this feature.

Previous discussion

This is a two-liner

Linked PRs

moonsikpark commented 1 year ago

This could be done manually using the following code snippet.

import ssl
assert ssl.OPENSSL_VERSION_INFO[:2] == ssl._OPENSSL_API_VERSION[:2], "OpenSSL major version mismatch!"

Because I agree this could be useful to have this check globally when users are importing the ssl module, I've opened a PR (python/cpython#100641) that addresses it.

eli-schwartz commented 1 year ago

Python on Ubuntu 22.04 was compiled with libssl3 and it can't be used in any software compiled with libssl1. There is no magic solution to this problem, but printing a warning or even quitting with a fatal error would avoid wasting many hours debugging very hard to find crashes.

Openssl already builds with symbol versioning to enable this. libssl3 should have symbols named foo@OPENSSL_3.0.0 and libssl1 should have symbols named foo@OPENSSL_1_1_0.

I would argue this is a bug in Node.js, the openssl util/mkdef.pl even supports customizing the names specifically so that people embedding openssl can use private names that don't conflict with other libraries in the same process.

mmomtchev commented 1 year ago

Node.js has always included a statically linked OpenSSL with re-exported symbols which are made available to its binary addons - this is part of its API. There are other ways to build it, but since the official binary packages (those by Nodesource) are built this way, mostly everyone else follows suit - to avoid shipping a version that won't be binary compatibly with the existing packages.

On all systems that I checked, not only Node.js, but also the default, system-provided libssl1 is not built with @OPENSSL_1_1_0 suffix. Maybe this a build option that no one really used before 3.0 was available? Is a libssl3 built with the suffix supposed to be compatible - in the sense that both can be present in the same shared memory of a process - with libssl1? Or does this apply only if both of them have the suffix?

eli-schwartz commented 1 year ago

On all systems that I checked, not only Node.js, but also the default, system-provided libssl1 is not built with @OPENSSL_1_1_0 suffix. Maybe this a build option that no one really used before 3.0 was available?

It's present in all systems that I checked, and having some value in the generated ld version script is not optional (it's hardcoded into util/mkdef.pl) -- although a target configuration can change it to OPENSSL_FOOBAR_1_1_0 instead by defining the variant -foobar (this also causes the openssl build system to generate the library name libssl-foobar.so.1.1). But that assumes that the version script is used.

I checked Debian, Ubuntu, Arch Linux, Alpine, and Fedora.

Is a libssl3 built with the suffix supposed to be compatible - in the sense that both can be present in the same shared memory of a process - with libssl1? Or does this apply only if both of them have the suffix?

The way this works is that when libssl3 has versioned symbols, then:

So, if CPython is linked with a libssl that does versions, then there's simply nothing to check, cpython's _ssl module is always correct, regardless of whether Node.js crashes.

If Node.js does versions in its internal ssl, then Node.js extensions linked to that should always be correct, regardless of whether CPython crashes.

eli-schwartz commented 1 year ago

Node.js has always included a statically linked OpenSSL with re-exported symbols which are made available to its binary addons - this is part of its API. There are other ways to build it, but since the official binary packages (those by Nodesource) are built this way, mostly everyone else follows suit - to avoid shipping a version that won't be binary compatibly with the existing packages.

Off-topic rant:

Node.js should not have done this, it causes huge distributor problems and also means that Node.js itself cannot upgrade OpenSSL except when the Node.js ABI itself changes. But they don't care about the former and the latter is considered an acceptable limitation.

The people that pay are the ones building system Node.js that cannot conform to platform policy -- and also the people trying to embed Node.js in other software that already uses OpenSSL, or vice versa. Which is exactly what's happening here.

What Node.js could have and should have done is wrapped the OpenSSL API in a node-specific API and re-exported that, so that node extensions don't link to openssl at all, they just link to node, and the actual version of openssl which node uses becomes an internal implementation detail that does not matter and can be freely swapped out by anyone that wants to. For example you could just build both python and node against the same libssl.

Whatever, it is what it is and python as well as every other program out there just has to live with it if they want to interface with node. :(

mmomtchev commented 1 year ago

@eli-schwartz I will reproduce the crash and I will give you more details, my setup is _ssl linked vs the system-provided libssl3 in Ubuntu 22.04 loaded by libpython loaded by Node.js 16 that includes a statically built OpenSSL 1.1 with exported symbols.

The crash occurs in OpenSSL 1.1 code called by _ssl.

The Node.js decision is indeed somewhat controversial, but I have always liked it, because, as a binary addon author, it allows me to easily ship an addon that includes for example libcurl built with the Node.js OpenSSL. This wouldn't be possible if Node.js wrapped the OpenSSL API.

mmomtchev commented 1 year ago

@eli-schwartz well, the answer is it does not work. My crash is hashlib.cpython trying to call EVP_DigestInit_ex@OPENSSL_3.0.0 and ends calling EVP_DigestInit_ex from OpenSSL 1.x statically built in Node 16. This ends up with calling a null pointer because a field in a structure is not initialized (or it does not carry the same meaning).

The first point is that his can never actually work - there is no dynamic linker mechanism which allows you to load multiple versions of the same shared library. The versioning mechanism of ld allows you to have a single shared library which exports multiple versioned symbols.

The real question however is why the silent error - since in this particular case ld is supposed to raise an error.

I did spend some time researching how versioned symbol resolving works - and the documentation is very scarce - so my only reference is this ancient document from Ulrich Drepper (the author of ld/glibc) - in which section Symbol lookup confirms this behavior. If a library does not use versioning at all, then an unversioned symbol will be used to link a versioned symbol.

I have also found various ld options which can influence the resolving behavior, but nothing that will turn this mismatch into an error. If there is anyone who is aware of such an option - this would probably be the best solution.

Bottom line is, if your Python is compiled vs OpenSSL 3.0.0 - which is the case in most very recent Linux distributions and the macOS homebrew - than using embedded Python in an application compiled vs OpenSSL 1.x will lead to various random crashes.