volta-cli / rfcs

RFCs for changes to Volta
Other
17 stars 14 forks source link

Switching to a Statically-Linked TLS implementation #47

Closed charlespierce closed 1 month ago

charlespierce commented 2 years ago

Proposal to switch to a statically-linked TLS implementation for HTTPS downloads via the Rustls crate.

Rendered

charlespierce commented 2 years ago

Something I missed, which I'll add in to the RFC: From this PR https://github.com/volta-cli/volta/pull/896, it appears that ring (which is a dependency of rustls) doesn't build on CentOS 6, so making this change may require dropping support for CentOS 6 (or splitting support so that the older versions continue to rely on OpenSSL)

jensmeindertsma commented 2 years ago

@charlespierce I'd love for this to become a reality! What are your thoughts on CentOS? Should we just stick with Docker for CentOS? We just need to make the install script a little smarter such that it will pick the CentOS download when we're on openssl version rhel correct? If that's not the openssl version we can just pick the single linux build that uses rustls? How do you think we should approach this on the cargo build side? According to https://doc.rust-lang.org/cargo/reference/features.html#command-line-feature-options we can disable the rustls feature for RHEL builds in build-and-package.sh. That seems fairly easy to do.

The "easier" alternative is to release Volta 2 which only supports what ring supports? What do you think about this?

jensmeindertsma commented 2 years ago

I investigated whether rustls will work with MUSL and ARM, and I've come to the conclusion that we still need to build the Linux binaries inside a Docker container. We need QEMU emulation to compile the aarch64 and musl targets.

Let me know what you think 😊

jensmeindertsma commented 2 years ago

Also, thanks so much for sticking with me through all these PR's. I love Volta and I'd love to help improve it. My wish is to get Volta running on Alpine and to accomplish that I'd love to do the work here. I would be super happy to contribute this to Volta. We just need to figure out some details ( read messages above ) and then I can work on the implementation.

jensmeindertsma commented 2 years ago

Something I missed, which I'll add in to the RFC: From this PR volta-cli/volta#896, it appears that ring (which is a dependency of rustls) doesn't build on CentOS 6, so making this change may require dropping support for CentOS 6 (or splitting support so that the older versions continue to rely on OpenSSL)

Since CentOS is really just x86_64-unknown-linux-gnu, can't we reuse the build from x86_64-unknown-linux-gnu for CentOS installations now that we're using rustls? We won't have to build a separate image for CentOS anymore?

My CentOS knowledge is lacking to confirm this, maybe someone else knows this? Since we're bundling rustls, can't we just install the x86_64-unknown-linux-gnu binary on CentOS?

jensmeindertsma commented 2 years ago

https://github.com/volta-cli/volta/pull/1165 here is my implementation of this RFC

jensmeindertsma commented 2 years ago

This change allowed me to easily add MUSL and ARM64 support to Volta, which is amazing. I believe this change will still work on CentOS6 because we now no longer depend on OpenSSL. I'll try out the binary on CentOS 6 before marking the pull request as ready.

charlespierce commented 2 years ago

@jensmeindertsma Thanks for pushing on this! There's (as you've seen!) a lot of complexity here, and it's been a couple of years now since this was set up, so I'm still loading everything back up.

We use CentOS in production builds for two reasons:

  1. Building a version linked against the CentOS / RHEL flavor of OpenSSL
  2. Building a version linked against an older libc (for any older Linux distros)

This change would remove the first, but we still want to maintain the support for older distros. We don't necessarily have to use CentOS any more, but IIRC the oldest version of Linux available natively in GitHub actions was still too new back when we were setting this up. I imagine that has gotten even more true now a couple years later. So we will likely still require a docker build.

Continuing to use CentOS 6 will ensure that we still have the same dynamic lib compatibility, so that's why I would lean towards using the same setup as much as possible (just without the extra 'Compile OpenSSL' step).

FWIW, it looks like it may be possible to get ring to compile under CentOS 6, as long as we upgrade GCC: https://github.com/briansmith/ring/issues/1417

jensmeindertsma commented 2 years ago

I investigated https://github.com/briansmith/ring/issues/1417 but I couldn't get it to work. I'll try some more tomorrow.

jensmeindertsma commented 2 years ago

How old of libc do you want to support? Is there some kind of date up until which you will support CentOS 6?

jensmeindertsma commented 2 years ago

Let's chat more in the PR 👍

charlespierce commented 2 years ago

I don't recall off the top of my head the specific version we support, however I remember doing a pretty big test run of possible Linux versions to make sure that it had wide support, which was how we landed on CentOS 6 as the base.

We don't have a specific date we were aiming for support, however the 1.0 release includes a commitment to stability. I wouldn't feel comfortable reducing our surface of support without it being a breaking (i.e. 2.0) release. At the same time, I believe a 2.0 release would need to be more than "We're dropping support for a set of old linux releases", as there are a few breaking / updating ideas that we still haven't had the opportunity to fully investigate.

Given that, it seems like the best path forward for this RFC would be to maintain the existing build environment, and thus keep the same compatibility support in 1.0. In the future, as part of creating a 2.0 release, we can certainly revisit the matrix of supported versions and potentially update our build. That said, I suspect we'll continue to want a containerized release, since relying on the GitHub actions environment ties our support cadence to GitHub's environment availability, which I don't think we want to do either.

jensmeindertsma commented 2 years ago

Thanks for the detailed explanations, that makes sense 👍

jensmeindertsma commented 2 years ago

This change would remove the first, but we still want to maintain the support for older distros. We don't necessarily have to use CentOS any more, but IIRC the oldest version of Linux available natively in GitHub actions was still too new back when we were setting this up. I imagine that has gotten even more true now a couple years later. So we will likely still require a docker build.

Continuing to use CentOS 6 will ensure that we still have the same dynamic lib compatibility, so that's why I would lean towards using the same setup as much as possible (just without the extra 'Compile OpenSSL' step).

I'm having a hard time building for CentOS 6 due to recurring issues with vault.centos.org as you described here https://github.com/actions/virtual-environments/issues/2264.

Just so I have an idea of what we're talking about, what amount of wide support are we missing out on if we just build on ubuntu-latest instead of CentOS 6?

I ran ldd --version ldd on ubuntu-latest

ldd (Ubuntu GLIBC 2.31-0ubuntu9.2) 2.31

versus CentOS 6

ldd (GNU libc) 2.12

When we link against 2.31, does that mean that we by definition do not support any glibc version below 2.31? I'm new to compilers, this helps me understand 🙏

charlespierce commented 2 years ago

I'm not an expert, but as I understand it, yes that's the case. If we build / link against version 2.31 of glibc, then it will work for future versions in the case of upgrades, but the dynamic loader won't be able to find it on older systems that only have an older glibc installed. Since glibc is backwards-compatible, we want to link against the oldest reasonable version that we can. More specifically, we went with the version included in CentOS 6 (which looks to be 2.12) as the minimum supported version, so to prevent a breaking change we should continue to use that same (or an earlier) version.

Since we won't need to use OpenSSL, however, it should be fine to use a different distro if there is an older Ubuntu or something with better package repo support. The main thing is the glibc version (and we'll probably need to do a bunch of testing on various distros to make sure we aren't breaking things as I wouldn't want to push out a release and then have to walk it back because it turns out we broke something.

charlespierce commented 2 years ago

Given the OpenSSL version issues (especially with multiple installed versions or the openssl binary not representing the proper libraries), I think this should move into Final Comment Period.

Closing the loop on the unresolved questions, my current thoughts are:

Versioning: If we make the switch, should we increment the minor version (i.e. move to Volta 1.1)? Ultimately, since there shouldn't be any visible changes for users, it would be reasonable to make it only a patch version update. On the other hand, having a minor version marker makes it a little easier to disambiguate the difference (e.g. it's slightly easier to remember that Volta 1.1 and above are statically-linked, rather than Volta 1.0.6 and above)

This change seems like it's lining up with the work @mikrostew did to support Yarn 3. While there's still a bit of cleanup on that front, I think the two together easily warrant a 1.1 release of Volta.

Do we need to do any sort of announcement / messaging around the change in underlying implementation?

Given the 1.1 release, it's probably worth writing a blog post detailing the new features, including this change to the TLS implementation and the reasoning behind it.

Does the install script need to maintain the ability to install older versions of Volta, or can we remove the OpenSSL detection logic entirely?

This is more up in the air, but we already have precedent for older versions to require a direct link install (see "Installing Old Versions" here). I think it makes sense to hard cut and simplify the installer script, leaving documentation about how to install previous versions, rather than adding to the complexity.