JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Drop mbedTLS and migrate to OpenSSL #48799

Open fxcoudert opened 1 year ago

fxcoudert commented 1 year ago

There has already been an issue that proposed migration to BoringSSL (https://github.com/JuliaLang/julia/issues/45856), which is not what I propose here.

From a security perspective, this seems great (fewer updates to manage, and track record shows mbedTLS is frequently behind in Julia).

Are there any downsides to migrating to OpenSSL and removing mbedTLS?

If not, and the idea has support, I volunteer to handle the migration PRs in Yggdrasil and julia.

giordano commented 1 year ago

Julia already depends on OpenSSL

Uhm, I don't think Julia itself does. Some packages in the ecosystem do.

Are there any downsides to migrating to OpenSSL and removing mbedTLS?

Isn't OpenSSL plagued by security vulnerabilities? Last October all stable versions were withdrawn because of high-severity security vulnerabilities, and the fix came only three weeks afterwards: https://www.openssl.org/news/newslog.html.

Pinging some people who may have opinions: @mkitti @eschnett @quinnj

DilumAluthge commented 1 year ago

Also cc: @StefanKarpinski @staticfloat

mkitti commented 1 year ago

I have significantly soured on mbedTLS in that they have no commitment to maintain the binary interface between minor releases.

With OpenSSL moving to semantic versioning with the version 3 release, I think we should seriously consider using OpenSSL v3 instead of mbedTLS. We need to finish moving the ecosystem to OpenSSL v3 first though.

In a similar vein, we should also consider libssh instead of libssh2.

mkitti commented 1 year ago

xref: https://github.com/JuliaLang/julia/issues/43677#issuecomment-1136602830

Here is an excerpt from the mbedtls release notes:

Some fields of mbedtls_ssl_session and mbedtls_ssl_config are in a different order. This only affects applications that define such structures directly or serialize them.

https://github.com/Mbed-TLS/mbedtls/releases/tag/v2.28.0
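
To illustrate why reordered struct fields break binary compatibility, here is a minimal sketch using Python's `ctypes`. The two structures and their field names are hypothetical stand-ins, not the real `mbedtls_ssl_config` layout; the point is only that the same fields in a different order land at different byte offsets, so code compiled against one layout misreads the other.

```python
import ctypes

# Hypothetical stand-ins for one library struct in two releases. These are NOT
# the real mbedtls_ssl_config fields, only an illustration of reordering.
class ConfigOld(ctypes.Structure):
    _fields_ = [
        ("endpoint", ctypes.c_uint8),
        ("read_timeout", ctypes.c_uint32),
        ("max_version", ctypes.c_uint16),
    ]

class ConfigNew(ctypes.Structure):
    # Same fields, different order, as can happen between releases.
    _fields_ = [
        ("read_timeout", ctypes.c_uint32),
        ("max_version", ctypes.c_uint16),
        ("endpoint", ctypes.c_uint8),
    ]

def field_offsets(struct_type):
    """Return {field name: byte offset} for a ctypes structure type."""
    return {name: getattr(struct_type, name).offset
            for name, _ in struct_type._fields_}

old = field_offsets(ConfigOld)
new = field_offsets(ConfigNew)
# Fields that a caller built against the old layout would now read wrongly.
moved = sorted(name for name in old if old[name] != new[name])

print("old layout:", old)
print("new layout:", new)
print("fields whose offset changed:", moved)
```

A caller that only uses accessor functions never sees these offsets, which is why the release note limits the impact to applications that define or serialize the structures themselves.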

fxcoudert commented 1 year ago

My 20 cents: OpenSSL has had hiccups in the past, but it seems to me much more stable and backward compatible than mbedTLS. It's also used more widely: at Homebrew, we have more than 10% of our software that depends directly on OpenSSL (and the majority of the packages have some indirect dependence on it).

mkitti commented 1 year ago

mbedTLS long term support

https://github.com/Mbed-TLS/mbedtls/blob/development/BRANCHES.md

We use Semantic Versioning. In particular, we maintain API compatibility in the master branch across minor version changes (e.g. the API of 3.(x+1) is backward compatible with 3.x). We only break API compatibility on major version changes (e.g. from 3.x to 4.0). We also maintain ABI compatibility within LTS branches; see the next section for details.

For the LTS branches, additionally we try very hard to also maintain ABI compatibility (same definition as API except with re-linking instead of re-compiling) and to avoid any increase in code size or RAM usage, or in the minimum version of tools needed to build the code. The only exception, as before, is in case those goals would conflict with fixing a security issue, we will put security first but provide a compatibility option. (So far we never had to break ABI compatibility in an LTS branch, but we occasionally had to increase code size for a security fix.)

mbedTLS Summary

OpenSSL long term support

https://www.openssl.org/policies/releasestrat.html

As of release 3.0.0, the OpenSSL versioning scheme is changing to a more contemporary format: MAJOR.MINOR.PATCH. With this format, API/ABI compatibility will be guaranteed for the same MAJOR version number. This more closely aligns with the expectations of users who are familiar with semantic versioning. However, we have not adopted semantic versioning in the strict sense of its rules, because it would mean changing our current LTS policies and practices. Version 3.0 will be supported until 2026-09-07 (LTS). We may designate a release as a Long Term Support (LTS) release. LTS releases will be supported for at least five years and we will specify one at least every four years. Non-LTS releases will be supported for at least two years.

No API or ABI breaking changes are allowed in a minor or patch release. The following stability rules apply to all changes made to code targeted for a major release from version 3.0.0 or later:

  • No existing public interface can be modified except where changes are unlikely to break source compatibility or where structures are made opaque.
  • No existing public interface can be removed until its replacement has been in place in an LTS stable release. The original interface must also have been documented as deprecated for at least 5 years. A public interface is any function, structure or macro declared in a public header file.
  • When structures are made opaque, any newly required accessor macros or functions are added in a feature release of the extant LTS release and all supported intermediate successor releases.
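
The versioning rule quoted above can be sketched as a small compatibility check. This is an illustration under stated assumptions, not OpenSSL's own tooling: the function names are hypothetical, and the extra requirement that the loaded library is not older than the one built against is my assumption (a newer minor release may add symbols that an older one lacks).

```python
def parse_version(text):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch

def abi_compatible(built_against, loaded):
    """True if `loaded` can stand in for `built_against`.

    Assumptions: compatibility holds within one MAJOR version (as the quoted
    policy states for 3.0.0 and later), and the loaded library must not be
    older than the one the binary was built against.
    """
    built = parse_version(built_against)
    have = parse_version(loaded)
    return built[0] == have[0] and have >= built

print(abi_compatible("3.0.8", "3.1.2"))  # same MAJOR, newer
print(abi_compatible("3.1.0", "3.0.8"))  # same MAJOR, older
print(abi_compatible("3.0.8", "4.0.0"))  # different MAJOR
```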

OpenSSL Summary

ViralBShah commented 1 year ago

A major problem we had was not sticking with their LTS versions. I believe 2.28 is an LTS version.

mkitti commented 1 year ago

Yes, but the problem is that "long term" for mbedtls is less than 2 years from now. Will we release another Julia LTS by then?

ViralBShah commented 1 year ago

I imagine we most likely will. Just a personal view.

joa-quim commented 1 year ago

GMT doesn't run on some Linux distributions because external binaries may depend on OpenSSL: https://github.com/JuliaLang/julia/issues/48419

mkitti commented 1 year ago

I'm not sure if this will solve GMT's issues regarding conflicts with system libraries. You might have to contend with multiple versions of OpenSSL libraries then.

The only ones who can really solve that issue are the system package managers. The alternative is to create our own "system" (e.g. the BinaryBuilder JLLs, containers, or conda-forge).

joa-quim commented 1 year ago

That problem is occurring with binaries built with Conda.

mkitti commented 1 year ago

Right. So conda (probably conda-forge) would need to figure out how to build julia with libraries configured to fit in the conda-forge ecosystem.

mkitti commented 1 year ago

In conda-forge, I think they might build with curl, libgit2, and libssh2 that use OpenSSL v3, so maybe that works with GMT?

I happen to be one of the contributors to the julia-feedstock.

https://github.com/conda-forge/julia-feedstock/blob/main/recipe/meta.yaml

https://github.com/conda-forge/julia-feedstock/blob/main/recipe/build.sh#L49

mkitti commented 1 year ago

Apparently mbedtls is really slowing down connections.

https://discourse.julialang.org/t/http-jl-async-is-slow-compared-to-python-aiohttp/96736/45?u=mkitti

PallHaraldsson commented 1 year ago

mbedTLS version 2.28 is the current LTS, and is supported for 3 years. 2.28 was released in December 2021 and will be supported until December 2024.

That means we can't use that LTS for 1.10, assuming 1.10 will become LTS, nor can we use any non-LTS mbedTLS, I believe. Does anyone know what might become the next LTS for it? Strictly speaking we can use any TLS in 1.10, we just promise to upgrade in our minor LTS Julia versions. Is that possible while keeping compatibility?

OpenSSL version 3.0 is an LTS release and will be supported until 2026-09-07.
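
The support-window comparison can be made concrete with a little date arithmetic. This is only a sketch: the end dates are the ones quoted in this thread (the exact day for mbedTLS 2.28 is an assumption, since only "December 2024" is stated), and the target date for a Julia LTS is purely hypothetical.

```python
from datetime import date

# End-of-support dates as quoted in the thread. The exact day for mbedTLS 2.28
# is an assumption; the thread only says "December 2024".
SUPPORT_ENDS = {
    "mbedTLS 2.28": date(2024, 12, 31),
    "OpenSSL 3.0": date(2026, 9, 7),
}

def covers(library, needed_until):
    """True if upstream support for `library` lasts at least until `needed_until`."""
    return SUPPORT_ENDS[library] >= needed_until

def days_short(library, needed_until):
    """Days by which upstream support falls short of `needed_until` (0 if covered)."""
    return max(0, (needed_until - SUPPORT_ENDS[library]).days)

# Hypothetical target: a Julia LTS that should stay patched until mid-2026.
target = date(2026, 6, 30)
for name in SUPPORT_ENDS:
    print(name, "| covers target:", covers(name, target),
          "| days short:", days_short(name, target))
```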

That seems better, we might not want to promise longer support for our Julia (next) LTS. We have actually never promised any time-frame that I know of, just "long". I'm agnostic about what the replacement would be. I'm not opposed to OpenSSL since it seems good, and is already installed on most Linux distros. I think we should actually support none, use what's already installed on your platform, or only use ours as a fallback if newer. Is the plan to e.g. use OpenSSL also on Windows? Do we already use mbedTLS there, or what Windows provides? Some argument could be made (for e.g. Windows) that it's less secure, broken by CIA or NSA, and not to be trusted... While I'm not too paranoid and would propose trusting them, maybe an ENV var should allow insisting on our bundled TLS...

Are there any downsides to migrating to OpenSSL and removing mbedTLS?

I have significantly soured on mbedTLS

I feel a bit responsible since I suggested mbedTLS in the beginning. It seemed good then; now things may have changed and OpenSSL or BoringSSL (and also HTTP/3) may be better, but I would prefer none of it in Julia, with Downloads excised as well, to reduce the maintenance burden...

Pkg is on its way out of the sysimage, and hopefully out of the new juliax eventually, and Downloads. Can we get away with all of this gone, and just in an upgradable (stdlib) library and/or rely on the system TLS only? Are there any platforms with no TLS/SSL?! Who are we helping including, not @Seelengrab or other for (not, yet, supported) embedded?

Seelengrab commented 1 year ago

Are there any platforms with no TLS/SSL?! Who are we helping including, not @Seelengrab or other for (not, yet, supported) embedded?

Uuuhh I'm not aware of either mbedTLS or OpenSSL being available in particular for embedded. They'd have to be written with that in mind anyway. I think most of that would happen in userspace anyway, so should not be special to embedded. No need to consider my esoteric usecases here.

staticfloat commented 1 year ago

Uuuhh I'm not aware of either mbedTLS or OpenSSL being available in particular for embedded.

mbedTLS is designed for embedded use cases, that's why it's got mbed in the name (after the embedded platform it was originally designed for). It's got pretty wide platform support.

vchuravy commented 1 year ago

If all goes well mbedtls_jll will be an upgradeable stdlib for 1.11. #51399 moves it out of the sysimg.

Seelengrab commented 1 year ago

mbedTLS is designed for embedded use cases, that's why it's got mbed in the name

Good point, I hadn't made that connection! It being available in theory and being able to use it through Julia on a microcontroller are still two very different things though :) So there's no need to consider my niche for a decision on what the core runtime should use.

If all goes well mbedtls_jll will be an upgradeable stdlib for 1.11. https://github.com/JuliaLang/julia/pull/51399 moves it out of the sysimg.

Nice, so that should alleviate worries about shipping an outdated version!


One thing that remains (from what I can tell) is the performance problems reported on discourse. Has someone investigated, or pinged the mbedTLS people about what could be the issue? Or is this expected behavior?

Seelengrab commented 1 year ago

Seems like @quinnj did a similar investigation into performance here.

quinnj commented 1 year ago

Yeah, my understanding is that OpenSSL (and other commercially used TLS libraries: boringSSL, aws crt's tls library, libressl, etc.) has custom assembly kernels for certain performance hotspots that mbedtls doesn't have; mbedtls is thus slower, yet easier to use/port to a wider variety of platforms.

mkitti commented 1 year ago

The problem is that there is a mismatch in objectives. Julia is not currently targeting the embedded case, yet mbedTLS is the default.

How could we produce an alternate build of Julia that depends on OpenSSL? Should we start with alternate dependencies that build against OpenSSL?

gbaraldi commented 1 year ago

Since there is a jll already one would need to see the uses of mbedtls and switch them to openssl.

eschnett commented 1 year ago

The packages currently using mbedTLS are:

Of these, I assume that only the last four are important for Julia itself. Many of the others are only using mbedTLS to stay transitively consistent.

giordano commented 1 year ago

I'm moderately sure most of them link to mbedtls only because of libcurl (libcurl config file will have -lmbedtls and so we need to pull in mbedtls_jll as well), not because they need mbedtls itself.
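
One way to see which TLS libraries get pulled in through libcurl's link flags is to scan the flags for `-l` options. The sketch below assumes a flags string in the style of what `curl-config --libs` prints; the example string and the set of library names are hypothetical choices for illustration, not taken from any actual JLL build.

```python
import shlex

# Library names treated as TLS/crypto backends in this sketch (an assumption,
# not an exhaustive list).
TLS_LIBS = {"mbedtls", "mbedx509", "mbedcrypto", "ssl", "crypto"}

def tls_libs_in_flags(link_flags):
    """Return the TLS-related libraries named by -l options in a flags string."""
    found = []
    for token in shlex.split(link_flags):
        if token.startswith("-l") and token[2:] in TLS_LIBS:
            found.append(token[2:])
    return found

# Hypothetical flags, in the style of what `curl-config --libs` prints.
flags = "-L/usr/lib -lcurl -lmbedtls -lmbedx509 -lmbedcrypto -lz"
print(tls_libs_in_flags(flags))
```

A package that links against libcurl with flags like these inherits the TLS dependency even if its own code never calls into the TLS library.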

fxcoudert commented 1 year ago

libssh2 and libgit2 need a crypto/tls backend, not just through curl (actually, I'm not sure they link to curl at all).

Edit: I've checked, and indeed, they don't need curl but they need crypto/tls. In Homebrew, we've been building them against openssl forever, and have no issue. We also build julia against our homebrew openssl-based libgit2 and libssh2 and have never had issues with it.

giordano commented 1 year ago

libssh2 and libgit2 need a crypto/tls backend, not just through curl (actually, I'm not sure they link to curl at all).

That's why I said "most of them", not "all of them" :slightly_smiling_face:

PallHaraldsson commented 1 year ago

FYI: There are 35 direct dependants of MbedTLS (as opposed to its JLL), such as AWS*. Don't we need to worry about them too? Or neither, since both will just continue to work?

Only MbedTLS is in the General registry currently, not the JLL, since it's a stdlib (I think upgradable stdlibs may be there). For Project.toml files it doesn't matter whether packages are there(?), i.e. what their origin is.

mkitti commented 1 year ago

The MbedTLS JLL will likely need to remain an "upgradable" standard library at minimum similar to DelimitedFiles.jl.

PallHaraldsson commented 1 year ago

mbedtls_jll will be an upgradeable stdlib for 1.11. #51399 moves it out of the sysimg.

How about dropping mbedtls[_jll] (and CURL) and having no replacement? It's already out of the sysimage, so where is it actually used?

I believe (only by the deprecated) Downloads.jl (indirectly by CURL), which is already an upgradable stdlib that is bundled and could be unbundled in 2.0. A lot of programs need to download, yes, but not nearly all, and some need to upload too, which I think it doesn't support anyway. Should such functionality be only in a package? Then there is less of a security risk for Julia, i.e. fewer reasons to upgrade it and maintain an LTS. [It will be easier to upgrade for security if you only need to upgrade a package, and ideally that package should auto-update itself. Pkg will depend on it.]

It's used by Pkg indirectly, i.e. for git I believe, but both are implementation details, so could change in a minor release. Note, we would still bundle it, it would just not be part of any official API.

If we do it this way, it would be good to lose it in 1.10 already (i.e. not promise it for LTS).

StefanKarpinski commented 12 months ago

The question is whether libcurl and libgit2 (if we still use it) are configured to use mbedtls or openssl.
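
For libcurl, one quick check is its version string, which lists the libraries it was built with as `name/version` tokens (this is what `curl --version` shows). The sketch below parses such a string; the function name, the set of recognized backends, and the sample strings are assumptions for illustration. It does not cover libgit2, which reports its backend differently.

```python
# TLS library names recognized by this sketch (an assumption, not exhaustive).
KNOWN_TLS = {"openssl", "mbedtls", "boringssl", "libressl", "wolfssl"}

def tls_backend(version_string):
    """Return (name, version) of the TLS library in a libcurl version string, or None."""
    for token in version_string.split():
        name, sep, version = token.partition("/")
        if sep and name.lower() in KNOWN_TLS:
            return name, version
    return None

# Hypothetical version strings in libcurl's "name/version" style.
print(tls_backend("libcurl/8.4.0 mbedTLS/2.28.2 zlib/1.2.13"))
print(tls_backend("libcurl/8.4.0 OpenSSL/3.0.11 zlib/1.2.13"))
```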

mkitti commented 11 months ago

  1. We should allow either to be used with those dependencies. That is, we should not assume that libcurl and libgit2 depend on a particular SSL implementation.
  2. By default, we should have libcurl and libgit2 depend on OpenSSL.

The conda-forge build (which I try to maintain) does use a libcurl and libgit2 that depend on OpenSSL on Linux. I'm not completely sure if that is fully functional.

StefanKarpinski commented 11 months ago

The JLLs for libcurl and libgit2 have to be built one way or the other and they have to depend on JLLs for either mbedtls or openssl, so we can't really allow either except for letting people use preferences to opt out of using JLLs for those libraries altogether.

SMillerDev commented 8 months ago

FYI: the current mbedTLS version in use by Julia will end support at the end of the current year.

mkitti commented 8 months ago

The note in BRANCHES.md says

mbedtls-2.28 maintained until at least the end of 2024

As far as I can tell, they have not yet declared another branch to be LTS.

SMillerDev commented 7 months ago

They have not, but it seems risky to wait for them to deprecate the current branch

fxcoudert commented 7 months ago

As of one week ago, mbedtls-3.6 is now the designated LTS branch: https://github.com/Mbed-TLS/mbedtls/blob/development/BRANCHES.md

There is no release on the 3.6 branch, however :)

PallHaraldsson commented 7 months ago

Did we already drop MbedTLS as of a recent PR? If so, can we just close this issue?

That said there is now Mbed TLS 3.6.0 LTS in case preferred... at: https://github.com/Mbed-TLS/mbedtls/releases

I'm not sure whether the number of improvements should make us reconsider dropping MbedTLS (is it speed-critical at all for what Julia/libssh2 itself needs?), such as TLS 1.3 support and:

AES performance improvements. Uplift varies by platform, toolchain, optimisation flags and mode. Aarch64, gcc -Os and CCM, GCM and XTS benefit the most. On Aarch64, uplift is typically around 20 - 110%. When compiling with gcc -Os on Aarch64, AES-XTS improves by 4.5x.

gojimmypi commented 3 months ago

Have you considered using wolfSSL? It is vastly superior to both OpenSSL and mbedTLS in every way: size, speed, flexibility. Plus it is certified: NIST, DO-178, etc.

It's a serious, commercial-grade library with solid TLS 1.3, post quantum, SM (ShangMi; essential for business in China), and more. I have a brief comparison in https://github.com/espressif/esp-idf/issues/13966

If there's already OpenSSL support, then wolfSSL has a compatibility layer. See also https://www.wolfssl.com/docs/wolfssl-openssl/

There are wolfcrypt wrappers for many languages. Would be great to see one for Julia!