golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.88k stars 17.65k forks source link

crypto/x509: add SetFallbackRoots and golang.org/x/crypto/x509roots/fallback package #43958

Closed breml closed 1 year ago

breml commented 3 years ago

Proposal

Go programs currently rely on the operating system to provide CA root certificate information in some sort of certificate system store. There are situations, where no such up-to-date CA root certificates are available like (Docker) containers built FROM scratch or out of date environments like poorly maintained or no longer updatable systems (e.g. older hardware appliances).

For some environments it is possible for the user of a Go program to add additional CA root certificates via SSL_CERT_FILE or SSL_CERT_DIR, but this is not the case for all supported environments (e.g. Windows) and it is definitively not user friendly.

Therefore, it is desirable for Go programs to have some mechanism to directly embed CA root certificate information into the program itself, so that they don't have to rely on system store to provide CA root certificates that may be absent (or out of date).

I make the following assumptions, which I think are reasonable:

Given those assumptions, I propose adding a new package crypto/x509/rootcerts. Importing this package (as import _ "crypto/x509/rootcerts") will cause CA root certificates to be embedded in the program. Also, building with the build tag rootcerts will force the package to be imported, and therefore will cause CA root certificates to be embedded in the program. The embedded CA root certificates will be used if and when no certificate system store is available (or the user forces the usage of the embedded data).

Source for the CA Root Certificates

I propose to use the Mozilla Included CA Certificate List, more specifically the PEM of Root Certificates in Mozilla's Root Store with the Websites (TLS/SSL) Trust Bit Enabled as the source for the CA root certificates.

The Mozilla Included CA Certificate List is the source for the CA root certificates embeded in the well known products of the Mozilla Foundation like for example Firefox (web browser) or Thunderbird (email client).

In contrast to most of the other software vendors, Mozilla maintains its Included CA Certificate List publicly and distributes it under an open source license (Mozilla Public License Version 2). This is also the reason why most of the Linux distributions, as well as other free unix derivates and wide spread tools, use this list of CA root certificates as part of their distribution.

Some examples:

In summary in my opinion it is safe to say that the Mozilla Included CA Certificate List is well established and widely used. In fact, if a Go program is run on Linux or an other free Unix derivate, chances are high that the root certificates used by the program are already provided by the Mozilla Included CA Certificate List.

Why include into the Go Standard Library

As the sample implementation (link in Annex below) clearly demostrates, that it is possible to write a 3rd party Go package, which achieves the same goal as the proposed package crypto/x509/rootcerts would. The main difference between a package in the standard library and a 3rd party package is: TRUST.

The root certificates are the top-most certificates in the trust chain and used to ensure the trustworthiness of the certificates signed by them either directly (intermediate certificates) or indirectly (through intermediate certificates). Therefore for a package containing and replacing the root certificates, trust is essential.

The same way, most users of Linux trust the CA root certificates provided by their distribution, it is very likely, that user would trust the CA root certificates provided by a package included in the Go standard library.

Additionally, the possibility to include the CA root certificates during build time, without altering the source code, is not possible with a 3rd party package but only if this package is included into the Go standard library and the build tag is implemented in to Go tool chain.

Update of the CA Root Certificates

The CA Root Certificates included in the standard library are updated with every release of Go (with the current schedule every 6 months). This would work the same way as it currently does for the package time/tzdata. The update frequency of the Included CA Certificate List is roughly every few months (2020: 5 times, 2019: 4 times, according to curl ca extract), which seems to be similar to the update frequency of the time zone data information.

In regards to updating the CA root certificates compiled into a Go binary, the same limitations apply as for the time/tzdata package. The information compiled into a binary is not updated. That being said, for the situations, this package is intended for, it is still an improvement because containers built FROM scratch are also not updated by default and out of date / not updatable systems obviously also do not get updates for the CA root certificates.

Annex

There is a sample implementation of this approach at github.com/breml/rootcerts with some additional reasoning about when to use such a package and what to keep in mind.

This proposal as well as the sample implementation are highly influenced by the proposal #38017 - time/tzdata and the implementation of the package time/tzdata by @ianlancetaylor

cc: @FiloSottile, @katiehockman, @mvdan

mvdan commented 3 years ago

At least from my personal experience, I've used scratch images quite a lot for deployments of "pure Go" services, and I needed ca-certificates more often than I needed tzdata. They both fit the same bill in terms of system files which the standard library might depend on, depending on what the program is doing. A significant amount of what I've written in the past ends up needing ca-certificates in the end (e.g. because of outgoing HTTPS requests), so it's almost a default that I make sure it's available beforehand.

This is all to say - if tzdata was included, I think rootcerts should too, given how I think it's at least just as important.

seankhliao commented 3 years ago

examples of other third party packges that provide this: https://github.com/certifi/gocertifi https://github.com/gwatts/rootcerts https://github.com/alexflint/stdroots

I'm not particularly convinced that having it in std is a good idea, since out of date tzdata is usually just a minor bug but out of date ca certs is a security issue. Maybe somewhere under x/crypto.

mvdan commented 3 years ago

I agree with what the proposal says, though. The current solutions people are using are also potentially vulnerable to not getting new CA roots for a long time. This includes cases where the binary runs on a system that's simply not being updated regularly.

At the end of the day, it's up to the developer to keep up to date with Go versions and rebuild/redeploy their binaries as needed. If that maintenance is not happening, the result is the same whether or not this proposal is accepted and used by them.

breml commented 3 years ago

examples of other third party packges that provide this: https://github.com/certifi/gocertifi https://github.com/gwatts/rootcerts https://github.com/alexflint/stdroots

One small difference between https://github.com/breml/rootcerts and the other 3rd party packages mentioned above is, that it can be used in a "none intrusive" way, because no application code needs to be changed. A simple blank import (import _ "crypto/x509/rootcerts") in package main is enough.

This is implemented the same way it is done in time/tzdata (see: https://github.com/breml/rootcerts/blob/master/rootcerts.go#L28-L34).

rolandshoemaker commented 3 years ago

I think there are two interesting parts to this proposal: implementing an API to set the default root pool, and adding a standard library package that contains some set of roots. While I am generally in favor of the first part, I am somewhat opposed to the second.

I think adding an API to crypto/x509 that allows setting the default root pool (rather than always using the system pool when no roots are specified) could be quite valuable, and would more cleanly and safely provide the functionality you (and some other existing packages) are getting with go:linkname tricks. I think the major downside of adding this kind of functionality though is allowing the user(/developer) to understand where their root pool is being set, and opens up the rabbit hole of dependencies setting a malicious set of roots (although it could be argued that this is already the case with the go:linkname trick, so we'd just be adding a more explicit easy to understand API to replace this inherently unsafe approach).

Speaking as a member of the security team, who would likely end up being responsible for maintaining it, I'm hesitant to implement a root store in the standard library. Even if we are just copying some existing root program, there are a number of somewhat complicated policy enforcement decisions we'd become embroiled in. I'm also not sure what the update process for the store would be, for instance if a root was to become distrusted in the NSS program, would we need to issue a security release of Go in order to provide an updated root store to users? I think if we were to do something like this it'd most likely need to live in a x/ repository as a module so that updates could be made in a less rigorous cadence. (Side note: just using the NSS trust store is not as simple on its face as it seems, while it is possible to extract a list of 'trusted' certificates there are somewhat complex policies around what certificates may be used for what, and a chunk of policy that is implemented in code, meaning that just pulling their list and taking it at face value is likely to result in trusting things we may not always want to trust.)

breml commented 3 years ago

@rolandshoemaker Thanks for your thoughtful feedback.

The main intend of my proposal is a standard library package that contains some set of roots.

What I care about is, a simple, save and trusted way to embed a trusted set of root certificates into a Go binary. I more than once observed inexperienced developers handling the missing ca-certificates.crt problem in awful ways (e.g. committing the file together with the code, where it gets forgotten and therefore is never updated). To be able to import the root certs in the same simple and straight forward way as the time zone data is very compelling.

Go positions it self as the language of choice for the cloud and cloud native applications. To cite @spf13 (Steve Francia in https://www.infoq.com/articles/go-language-13-years/):

Go’s strength is still the cloud/server applications that Go is such a good fit for, ...

The second, much more significant phase, will be the industry shifting to take advantage of the unique cloud offerings, increasingly moving to cloud-native application development. In these cases, Go is the clear choice.

Lots of these cloud native applications will run in container environments and lots of them will need to talk to external APIs, which then will need root certificates to establish the secure connection. There I see a clear benefit in the Go standard library providing a simple way of including the CA root certificates into the Go binary. It is maybe a little bit of a stretch, but if a Go application is run in a container FROM scratch, then the Go application sort of becomes the operating system and operating systems normally provide a set of root certificates.

I clearly understand your concern for you as a member of the security team and I completely agree, that is likely your team, who will end up being responsible for the maintenance. But if we look at this from users point of view, then we are definitively in a better place, when the Go security team takes care of this, than if a random open source contributor publishes a package with this functionality. If this proposal gets accepted, I will happily remove/deprecate my package. With the still rapid growing number of Go developers around the world I feel the maintenance of these root certificates inside of the Go standard library is worth the effort to increase the overall security and trust of the Go and especially the cloud native / containerized eco system.

For the update process as well as the location of such a package, it is my belief, that the package should life in the standard library (and not in x/) for the following reasons:

In regards to the NSS trust, I did not go into full details here, but it is my observation, that this problem has relaxed over the last couple years. On one hand, all the mentioned Linux distributions have solved this problem one way or the other, so this is a pretty well understood problem with battle proved solutions available. On the other hand does Mozilla provide the PEM of Root Certificates in Mozilla's Root Store with the Websites (TLS/SSL) Trust Bit Enabled (see: CA/Included Certificates). In my understanding, this is already the correct list of certificates with the correct trust bits enabled.

The other part you have identified in my proposal (API to set the default root pool) is less relevant for the use cases I am outlining in the proposal, for the following reasons:

  1. On Unix/Linux based systems, Go already today offers the possibility to add additional certificates to the default pool by setting the respective environment variables.
  2. There are other proposals covering somewhat related aspects:
    1. crypto/x509: add ability to reload root certificates #41888
    2. crypto/x509: Convenient, efficient way to reload a certificate pool after on-disk changes: #35887
  3. For me, the way the system roots are loaded is not important. The go:linkname trick has been bluntly copied from time/tzdata, so nothing I invented myself.
rsc commented 2 years ago

How often do root sets change? How quickly would one go stale in an old Go distribution? How many different ones are there? Is there an obvious one that we could pin ourselves to, like we pin to the standard timezone database.

seankhliao commented 2 years ago

Chrome's new initiative to maintain its own root store may be of interest https://www.chromium.org/Home/chromium-security/root-ca-policy/

joeshaw commented 2 years ago

How often do root sets change?

curl maintains a bundle of the Mozilla root certs: https://curl.se/docs/caextract.html

It may not be comprehensive but there were 5 updates in 2021 and 5 so far in 2022.

How many different ones are there?

The big ones are Mozilla, Google, Microsoft and Apple. There may be additional ones.

Is there an obvious one that we could pin ourselves to, like we pin to the standard timezone database.

The Mozilla one is the best candidate here. Their policy is documented here and they have an open process for inclusion. Most open source software and Linux distributions use the Mozilla store.

breml commented 2 years ago

My "proof of concept" (https://github.com/breml/rootcerts), mentioned above, does use the root certificates from Mozilla, similar to the way it is done for curl.

This root set does contain 146 certificates as of now. I quickly analyzed the list from Included CA Certificates (CSV) from https://wiki.mozilla.org/CA/Included_Certificates (with the email only certificates filtered out). The average validity of these certificates is 23.5 years (with a minimum of 9 years and a maximum of 35 years).

So the number of certificates that are needed to be updated will average in the range given by @joeshaw (~6 year).

In contrast to tzdata updates, this seams to be in a similar range (see: http://mm.icann.org/pipermail/tz-announce/):

2022 ytd: 1 2021: 5 2020: 6 2019: 3 2018: 9 2017: 3 2016: 10 2015: 7 2014: 10 2013: 9

rsc commented 2 years ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

rsc commented 2 years ago

/cc @golang/security

rsc commented 2 years ago

One reason this would be interesting to resolve is that currently "single-Go-binary containers" cannot make HTTPS connections. You have to side-load the root set or derive the container from a full Linux distribution. If we could optionally put the root set into the binary, that would make single-binary containers useful for more programs.

FiloSottile commented 2 years ago

I am conflicted about this.

I agree it makes single-binary deployments more convenient, and it feels like a natural feature to provide.

On the other hand

  1. root stores are moving more and more away from being a list of homogeneous trusted anchors, and towards rich metadata that is only expressed in the code of the verifier they ship with (name constraints, time validity, CT requirements...) which is why we invested so much last year in moving to the platform verifier
  2. root stores are aggressively moving towards higher root agility, meaning that past numbers on the frequency of root changes are not indicative of the next five years, and we'd be committing to maintaining this pretty much forever
  3. distrust changes MUST be shipped as security releases (this is a whole debate in the Linux distribution world, and we don't want to become the next chapter of the problem of slow to update root stores), which will increase the security release load
  4. it's not clear what store we should ship: the canonical answer is the Mozilla one, and it's not a bad answer; the Chrome one might be interesting too, but I'm not sure they would want us to decouple it from the verifier
  5. trust is fundamentally a platform concern (which Windows and macOS and iOS understand) and the correct answer to this is arguably for container runtimes to synthesize an /etc/ssl/roots.pem file the same way the synthesize /etc/resolv.conf

This might make more sense as an auto-updated x/crypto package, which then would encourage adding x509.SetDefaultRoots() but I don't love the semantics of a runtime-changable singleton like that. I can see libraries arbitrarily using it in init() leading to conflicts and unexpected behavior that is hard to debug.

qdm12 commented 2 years ago

From my narrowed github user point of view, having it as a module dependency together with dependabot is nice, so I'm alerted when it gets outdated. If we want it as part of x/crypto it would make tooling such as dependabot not alert us for outdated CAs.

rolandshoemaker commented 2 years ago

I agree with all of the points @FiloSottile makes (unsurprisingly.) This is really a platform problem, and in an ideal world platforms would solve it for us. That said this has been a problem for a while, and as far as I can tell, there has been no real movement towards a real platform fix for containers and such. It is probably better that we provide at least some centralized solution for now, in order to dissuade people from doing the less ideal runtime hacks that are described further up this thread.

I think if we are to go down this path it should absolutely be a separate x/crypto (or perhaps a completely new x/) module, which simply provides a pre-populated x509.CertPool. This would let us decouple any urgent updates from the Go release cadence (and if done outside of x/crypto any other changes people may be less interested in pulling down). Auto-updating the module would also allow us to embed some sort of auto-destruct staleness check, preventing usage of the module if it is too out of date.

Where the list comes from is a complicated question, Mozilla is almost definitely the easiest solution, and has a pretty good history of not being full of terrible stuff. There are some not-chrome-but-adjacent Google folks working on related stuff who it might be worth talking to (I'll need to trawl my inbox to try to remember who exactly to ping.)

I agree that the prospect of adding a SetDefaultRoots type API is not ideal, and will likely result in some people setting roots in places they probably shouldn't, but this can essentially already happen with runtime linking, without a clear API around the behavior of multiple modules attempting to change the default pool. At least with a clear public API we can document the expected behavior, i.e. only allowing a single call to the method, panicking otherwise ¯_(ツ)_/¯.

breml commented 2 years ago

From my narrowed github user point of view, having it as a module dependency together with dependabot is nice, so I'm alerted when it gets outdated. If we want it as part of x/crypto it would make tooling such as dependabot not alert us for outdated CAs.

If I understand this correctly, this is related to the fact, that there are no tagged releases for x/crypto, correct? For me, this raises the following questions:

rsc commented 2 years ago

It sounds like we may want to do something to nudge people's behavior here toward things we'd prefer.

It also sounds like we could add a very tiny hook in crypto/tls (Roland said SetDefaultRoots) which can be called just once (twice is a panic), and then the actual cert sets can be provided out of standard library by x/crypto, which is easier to update. Do I have that general sketch right?

Separately, Filippo said that verifiers are starting to add different kinds of new logic. So is the "pluggable root set" the right approach or is it "pluggable ____" where we still need to fill in the blank? Maybe a verifier function or interface makes more sense than a raw CertPool?

rsc commented 2 years ago

/cc @rolandshoemaker

FiloSottile commented 2 years ago

Yeah, that's a good call, maybe this should be SetDefaultVerifier(func(c *Certificate, opts VerifyOptions) (chains [][]*Certificate, err error)) which reflects better the shape of the default "root set" (actually, the platform verifier API) on non-Linux OSes.

Only ~two~ three concerns:

  1. Packages setting it unconditionally, including on macOS/Windows where it's unnecessary, and getting a strictly worse verifier (less compatible, and less secure). This is mitigated by only making our x/crypto implementation do anything on non-macOS/Windows systems.
  2. If this is called, is there still a way to access the actual default verifier? For example an application might want to extend the default verifier by wrapping it.
  3. This is generally spooky action at a distance and it would be confusing if it caused issues because it can't be detected by reviewing the verification callsite, but on balance it's probably unavoidable and worth it.

I guess if this replaces the default verifier, then it becomes the behavior attached to the SystemCertPool, and SystemCertPool + additional roots becomes "run this and then run a separate verification with the additional roots. I'm ok with it even if SystemCertPool becomes a bit of a misnomer.

rolandshoemaker commented 2 years ago

SetDefaultVerifier is an interesting idea, it would definitely allow creating a verifier that more closely matches the logic of some of the existing root programs (wherein trust decisions are often defined in code, rather than consumable configs). As it is a significantly more powerful API though, it obviously creates significantly more trapdoors to subtly break the certificate verification process (either via malicious action or not.)

I'll pushback a little on this idea though. Most of the additional logic that Mozilla adds on top of their list is restrictions on various borderline roots, where the CA has done something bad but because of wide legacy deployment, they feel the need to maintain trust for it in very specific circumstances. Is this something we need to support? Possibly, but we could take on significantly less complexity by simply trimming these roots out and providing binary trust decisions.

I'm not super strongly opinionated, I tend to lean towards less complexity, but I agree that providing a way of modifying the default system verifier would definitely be extremely powerful, allowing some interesting behaviors down the road.

breml commented 2 years ago

My understanding so far is, that in my initial proposal, one may enable the default roots by simply adding import _ crypto/x509/rootcerts (or providing the build flag -rootcerts), like this:

package main

import (
    _ "crypto/x509/rootcerts"
    "log"
    "net/http"
)

func main() {
    res, err := http.Get("https://www.google.com/robots.txt")
    if err != nil {
        log.Fatal(err)
    }

    log.Infof("%d", res.StatusCode)
}

With SetDefaultRoots, I imagine the code to look like this:

package main

import (
    "crypto/x509"
    "log"
    "net/http"
)

func init() {
    x509.SetDefaultRoots()
}

func main() {
    res, err := http.Get("https://www.google.com/robots.txt")
    if err != nil {
        log.Fatal(err)
    }

    log.Infof("%d", res.StatusCode)
}

I am a little bit lost on the SetDefaultVerifier suggestion. How would this exactly be used?

Additionally, I see the following issue: If we have a very thin hook in crypto/x509 to set the default root certs and the root certs themselves are located in a different package (e.g. in x/crypto), the root certs would always be included into the final binary, as soon as there is a dependency to the crypto/x509 package. Or will the compiler be able to detect, that SetDefaultRoots is not called and therefore the binary data for the root certs is nowhere referenced and can therefore be striped from the final binary? In the original proposal, the decision, if the root certs should be included into the binary is left to the the user.

rolandshoemaker commented 2 years ago

I believe the two suggestions are as follows (after writing this I am pretty convinced SetDefaultRoots is the best option in the short term.)

"SetDefaultRoots"

In crypto/x509 add the following API func SetDefaultRoots(roots *CertPool), which allows setting the default root pool to use when verifying certificate chains, rather than using the default system roots. A new package (or module under x/) would then be created to house the bundled root set (packages which want to use this would then call x509.SetDefaultRoots(rootbundle.Pool) during setup or in an init in order to register the default pool.)

"SetDefaultVerifier"

A more complex API would be introduced in crypto/x509, along the lines of :

  type Verifier interface{
    Verify(input *Certificate, opts x509.VerifyOptions) (chains [][]*Certificate, err error)
  }

  func SetDefaultVerifier(verifier Verifier)

This would allow completely replacing the Go chain verifier with another implementation. This would allow implementation of much more complex root policies (such as restricting certain roots in various ways) by either wrapping the existing Go verifier logic (although, we'd need some way to short circuit calling out to the new verifier, or introducing a direct API call for it, so that it can still be invoked after registering a new verifier), or implementing their own verifier entirely (the standard library could also use this functionality to somewhat more cleanly handle the usage of platform verifiers, rather than having escape hatches in the pure Go verifier.)

FiloSottile commented 2 years ago

Thinking more about it, I agree SetDefaultRoots is the right tradeoff over SetDefaultVerifier for now.

I still wish for a hint against using it if not necessary, such as on Windows and macOS. Maybe it’s just that the x/ package docs show an example that wraps the call in an if?

A more radical alternative would be SetFallbackRoots, which are only used if loading the system roots or invoking the system verifier fails. I like that it doesn’t modify any non-error behavior, but fixes the error scenario. That might result in less unexpected or surprising behavior (like changing a system root and not seeing Go picking it up, because some package is using SetDefaultRoots).

Were there compelling use cases for SetDefaultRoots besides filling in for missing system roots?

breml commented 2 years ago

Were there compelling use cases for SetDefaultRoots besides filling in for missing system roots?

@FiloSottile My very initial use case for this proposal was not about filling in for missing system roots but replacing system roots, that have no longer met my requirements. Specifically, the root certs available on my SOHO NAS did no longer work with my cloud storage provider of choice and therefore rclone refused to perform my off site backup (see also https://breml.github.io/blog/2021/01/17/embed-ca-root-certificates-in-go-programs/). That being said, I don't now if this matches your definition of "compelling".

This brings me back to one aspect of the initial proposal, that is not covered by SetDefaultRoots (or SetFallBackRoots). For the above mentioned use case, it would be beneficial to have support for a build tag rootcerts (similar to timetzdata), that allows to build an existing code base (e.g. rclone) with the root certs included without the need of changing the source code or creating a fork.

FiloSottile commented 2 years ago

@breml Oh! I think your use case is satisfied by the SSL_CERT_FILE and SSL_CERT_DIR environment variables. Sorry for not realizing sooner 😅

Those fill the "operator replaces system roots without recompiling" need. The gap I see is "self-contained binary embeds roots to run on systems that don't provide them". Unless I am missing other use cases, SetFallbackRoots might be a better solution.

breml commented 2 years ago

@breml Oh! I think your use case is satisfied by the SSL_CERT_FILE and SSL_CERT_DIR environment variables. Sorry for not realizing sooner sweat_smile

I am well aware of SSL_CERT_FILE and SSL_CERT_DIR, both are mentioned in the initial proposal. Also both use cases (containers and poorly maintained systems) are mentioned in the initial proposal in the 1st and 2nd paragraph. The problem with the legacy system, which I have initial solved with the before mentioned environment variables, was just the trigger to investigate this problem further. Pretty quick I realized, that this would be indeed very helpful for "self-contained binaries" in e.g. containers FROM scratch. Only then (and after having a working PoC) I decided to file this proposal.

I am not sure, if your statement is meant to be sarcastic or not. But if so, I have a hard time to take it.

FiloSottile commented 2 years ago

@breml Sorry, this was a blunder on my part, there was a long-running discussion also before the proposal about this and I lost context. I probably should have re-read the proposal before commenting. Disregard that part of the comment, please.

Still, I feel like if we focus on the problems/use-cases, and remove those handled by SSL_CERT_FILE and SSL_CERT_DIR, what's left might be sufficiently covered by SetFallbackRoots.

rolandshoemaker commented 2 years ago

Were there compelling use cases for SetDefaultRoots besides filling in for missing system roots?

Not that I can think of. There are definitely cases where people may want to set a specific set of roots to used when a root pool is not provided, by as you mention above, those are already satisfied by SSL_CERT_FILE and SSL_CERT_DIR. SetFallbackRoots sounds like a good way to generally avoid the footgun of registering things when you really shouldn't be, I would support it over SetDefaultRoots.

FiloSottile commented 2 years ago

With SetFallbackRoots semantics, I would also support the x/ package automatically registering the pool as a side-effect of being imported, as it's unlikely applications will want to apply conditions to it.

rolandshoemaker commented 2 years ago

With SetFallbackRoots semantics, I would also support the x/ package automatically registering the pool as a side-effect of being imported, as it's unlikely applications will want to apply conditions to it.

Yeah that seems reasonable to me.

rolandshoemaker commented 2 years ago

Just to tie things up, it sounds like the consensus is now:

breml commented 2 years ago

I like the general direction this is taking. The fallback behavior is also in line with the time/tzdata behavior which is described as (source: https://pkg.go.dev/time/tzdata@go1.19.1):

Package tzdata provides an embedded copy of the timezone database. If this package is imported anywhere in the program, then if the time package cannot find tzdata files on the system, it will use this embedded information.

I have the following questions though:

  1. Why do we need to panic, if x509.SetFallbackRoots is called multiple times? For time/tzdata this also not the case. We can put the same disclaimer in the description of the new package (as well as SetFallbackRoots) as in time/tzdata:

    This package should normally be imported by a program's main package, not by a library. Libraries normally shouldn't decide whether to include the timezone database in a program.

  2. Would we put the init code in the new module (most likely x/...) in sync.Once, such that at least importing this new package multiple times would not panic? (I can not see the advantage of a panic in this case, because the certificates that are loaded as fallback are the same).
  3. SSL_CERT_FILE and SSL_CERT_DIR have already been mentioned at length. For the case, where a binary is compiled with the fallback roots included, but crypto/x509 is able to load system roots (which might be outdated), I would still wish for a way to force the usage of the included fallback roots, e.g. with a SSL_CERT_FORCE_FALLBACK environment variable.
breml commented 2 years ago

@stapelberg Your project github.com/gokrazy/tools does use github.com/breml/rootcerts in a "non-standard" way by using the embedded root certs directly, so you might want to chime in on this issue and verify, if your use case would be resolved by the current proposal as well.

stapelberg commented 2 years ago

Thanks for the ping.

Indeed it seems like my use-case (the gokrazy Go appliance platform) is not covered by the current proposal:

  • Add a new module somewhere which contains a set of roots (probably the NSS list) that when imported calls x509.SetFallbackRoots in a init statement, registering those roots as the fallback.

Could this new module also expose the roots programmatically, and not just register them in the current process?

In my case, the gokrazy packer (a Go program) creates a self-contained root file system image to be run (with the Linux kernel) on a Raspberry Pi (or similar) that only contains other Go programs.

The gokrazy packer can easily be run on Linux, where we just copy the system root file into the resulting image. But on macOS and Windows, we don’t have a system root file that we can copy. That’s where we currently use github.com/breml/rootcerts.

Ideally, we would programmatically create a roots file (at “gokrazy packer time”) that the Go runtime would then load (at “Raspberry Pi run time”).

FiloSottile commented 2 years ago

Thanks @breml and @stapelberg, getting feedback on use-cases is always useful!

Ideally, we would programmatically create a roots file (at “gokrazy packer time”) that the Go runtime would then load (at “Raspberry Pi run time”).

Ok, if I follow correctly, you need (1) a source of roots and (2) a way to feed them to the runtime.

For (1) I am thinking that using this x/crypto package would just be a convoluted way of fetching the Mozilla roots, which you could do directly and cut the middleman (with its lag) out. Although, maybe since we're already doing that work we can also just expose the []*x509.Certificate slice.

For (2) either SSL_CERT_FILE or the proposed x509.SetFallbackRoots work, right?

rolandshoemaker commented 2 years ago

Another option for (1): we're going to need to build some tooling that consumes the NSS list (or other TBD source) and generates Go code. We could just stick that in a package in the module for external consumption 🤷.

stapelberg commented 2 years ago

Ok, if I follow correctly, you need (1) a source of roots and (2) a way to feed them to the runtime.

For (1) I am thinking that using this x/crypto package would just be a convoluted way of fetching the Mozilla roots, which you could do directly and cut the middleman (with its lag) out. Although, maybe since we're already doing that work we can also just expose the []*x509.Certificate slice.

Yep, exposing the x509.Certificate slice would work for me. I don’t mind the lag :)

For (2) either SSL_CERT_FILE or the proposed x509.SetFallbackRoots work, right?

Yes. Those might not even be necessary for my particular case — I can just write a .pem file in the right place in the resulting root file system, and the runtime will already find them.

breml commented 2 years ago

If "... expose the []*x509.Certificate slice." means, that we have an exported variable of this type, then I would rather prefer to have an exported function, that returns this slice, such that the variable holding the certificates is kept private. Similar to the way it is done in https://github.com/breml/rootcerts/blob/master/embedded/embedded.go#L10.

Additionally, what are your thoughts about my questions in https://github.com/golang/go/issues/43958#issuecomment-1256879602 ?

rsc commented 2 years ago

Are we converging to any specific API?

breml commented 2 years ago

The whole topic can be split in 3 parts:

  1. Provide a facility to add fallback roots.
  2. Provide a properly versioned package, which contains an embed-able set of root certificates from a trustworthy source.
  3. Allow to force the usage of the fallback roots (over the system roots).

It is my understanding, that we have agreement on the following aspects:

Proposed without any objection so far:

In my opinion we have not yet consensus on the following questions:

rolandshoemaker commented 2 years ago

Thanks for the summary @breml.

Summarizing the API proposal, and answering a few of the open questions:

crypto/x509

I think myself and @FiloSottile are in agreement that this should fail if called more than once, either via panic, or by returning an error. The rational for this is that this is generally somewhat complex and subtle behavior, and is likely to be somewhat difficult to debug failures that it would cause in cases where importing a package attempts to register a specific set of roots. Failing in this case makes the behavior much more explicit, and hard to mess up. Additionally there is little reason for why a user would want to attempt to register fallbacks multiple times, in general it just seems like a foot gun. I think panic vs. error is an open question though.

// SetFallbackRoots sets the roots to use during certificate verification,
// if a system certificate pool is not available (for instance in a container
// which does not have a root certificate bundle). SetFallbackRoots may only
// be called once, if called multiple times, it will panic.
func SetFallbackRoots(roots *CertPool)

-- or --

// SetFallbackRoots sets the roots to use during certificate verification,
// if a system certificate pool is not available (for instance in a container
// which does not have a root certificate bundle). SetFallbackRoots may only
// be called once, if called multiple times, it will return an error.
func SetFallbackRoots(roots *CertPool) error

This can be implemented separately from the following changes, and I'd like to target 1.20 for this.

New module x/nssroots or x/rootcertbundle (or something)

Introduce a new module which exposes no API, and simply calls x509.SetFallbackRoots in its init statement on an unexposed, generated x509.RootPool. I think the name is still up for consideration (I believe we are all pretty much on the same page about using the NSS trust store, but the name should probably include this, in case we want to provide another different trust list in the future.)

The package that registers the fallback roots would contain a date at which it was generated, if the date is more than N days/months old (N tbd) init would panic so that users do not end up relying on old out-of-date trust stores. Similarly, if x509.SetFallbackRoots returned an error itself, rather than panicking, if called more than once, init would also panic here.

The generated pool would be generated by a tool in a sub-package which exports the following API:

// FetchOptions specifies options for fetching the trust store
type FetchOptions struct {
  client *http.Client
}

// FetchNSSRoots fetches the NSS root certificate trust store, and parses it into
// a list of *x509.Certificates.
func FetchNSSRoots(opts FetchOptions) ([]*x509.Certificate, error)

This tool would be called by a generate script. We would automate an update process which refreshes the trust list every N days/months (N tbd.) Basically just a cron that runs the generate tool, bumping the generation date, and sends a CL. The automated tagging process would then bump the module version.

misc

Can we add a way to force the usage of the embedded roots e.g. with an env var. (see https://github.com/golang/go/issues/43958#issuecomment-1256879602).

I think this is already satisfied with the suggested API, and the existing behavior of SSL_CERT_FILE. Setting SSL_CERT_FILE=/dev/null and calling SetFallbackRoots, while not necessarily the prettiest solution, essentially solves this, forcing the fallback to always engage.

breml commented 2 years ago

Thanks for the explanations @rolandshoemaker.

crypto/x509

... I think panic vs. error is an open question though.

OK, I get it. In this case, I prefer SetFallbackRoots to panic.

This can be implemented separately from the following changes, and I'd like to target 1.20 for this.

To have this in 1.20 would be great.

New module x/nssroots or x/rootcertbundle (or something)

Introduce a new module which exposes no API, and simply calls x509.SetFallbackRoots in its init statement on an unexposed, generated x509.RootPool.

In this case, I do not agree for the following reasons:

  1. We already discussed the use case by @stapelberg, who would like to access the root pool.
  2. I think the choice, if the root pool is still considered "recent enough" needs to be with the user of this package.
  3. I can think of situations, where users of this package would like to modify the pool of root certs before registering them as fallback (consider a CLI API client for one particular API, in this case it would be enough, if the root certs needed for this particular API are registered).

I think the name is still up for consideration (I believe we are all pretty much on the same page about using the NSS trust store, but the name should probably include this, in case we want to provide another different trust list in the future.)

I do not have a strong opinion about the name. But I think it is a smart move to have this as a separate module, because it will allow us to have it properly versioned and those work with tools like dependabot.

The package that registers the fallback roots would contain a date at which it was generated, if the date is more than N days/months old (N tbd) init would panic so that users do not end up relying on old out-of-date trust stores. Similarly, if x509.SetFallbackRoots returned an error itself, rather than panicking, if called more than once, init would also panic here.

In my opinion, it is way too intrusive to add an "EOL" date to this package, which will cause the applications to panic if the EOL date is passed. This decision in my opinion should remain with the user of this package.

I propose an API along those lines:

const rootPool = `<filled by the generator tool>`

// Certificates returns the embedded certificates as a slice of x509.Certificates.
func Certificates() []x509.Certificate

// generatedTime is filled with the time when the generator tool has updated
// this package.
const generatedTime = ``

// GeneratedTime returns the time, when the root certificates embedded in this
// package have been updated.
func GeneratedTime() date.Time

An alternative function signature to return the certs could be func CertificatesAsPEM() string.

Obviously returning a x509.CertPool does not really help here, because the x509.CertPool does not expose the necessary information.

The generated pool would be generated by a tool in a sub-package which exports the following API

I would prefer to not export the API of the generator tool and keep it internal as an implementation detail.

This tool would be called by a generate script. We would automate an update process which refreshes the trust list every N days/months (N tbd.) Basically just a cron that runs the generate tool, bumping the generation date, and sends a CL. The automated tagging process would then bump the module version.

👍🏻

misc

I think this is already satisfied with the suggested API, and the existing behavior of SSL_CERT_FILE. Setting SSL_CERT_FILE=/dev/null and calling SetFallbackRoots, while not necessarily the prettiest solution, essentially solves this, forcing the fallback to always engage.

Thanks for this workaround, I think I can life with this. Just for the record, one needs to set both, SSL_CERT_FILE and SSL_CERT_DIR to /dev/null for this to work. Only setting SSL_CERT_FILE was not enough on my Linux machine.

stapelberg commented 2 years ago
  • I think the choice, if the root pool is still considered "recent enough" needs to be with the user of this package.

Strong +1 on this. If the new package self-destructs after a certain time without updates, that makes it so unattractive to me that I can’t use it :(

rolandshoemaker commented 2 years ago

Before getting to the other comments, I just wanted to pull this one out first because I realize I've only had conversations about this out-of-band, and not in this thread, so I want to add some additional context to my thought process here.

In my opinion, it is way too intrusive to add an "EOL" date to this package, which will cause the applications to panic if the EOL date is passed. This decision in my opinion should remain with the user of this package.

The web PKI ecosystem has been plagued in the past (and the present) by stale trust stores, which due to the inclusion of expired, no longer trusted, and compromised roots, and the exclusion of new, otherwise trusted, roots, has added significant pain and complexity to both clients/servers and CAs.

Adding what is essentially a new trust store to the ecosystem (regardless of the fact it is just a re-packaged version of an existing store) which continues to perpetuate these issues, by allowing users to simply pull in a specific version of a module and then completely forget to ever update it, resulting in an out-of-date trust store baked into projects/binaries, I think does a disservice to the web PKI.

I don't necessarily think providing the generation date and allowing users make that decision for themselves really solves this issue, as we'd be creating an API which contains a massive footgun by default. The complexities of the web PKI are not obvious to many users, and one of the goals we've had with the various crypto/x509 APIs is trying to make them as safe as possible for the majority, even if that removes some of the more complex functionality it could have. I suspect most users of this package would simply import it/register the root pool and be done with it. If we believe the majority of users should be checking for the staleness of the bundled roots before registering them (which I personally do), I think that should be default behavior.

We could add an escape valve for advanced users (be that a env var or something else), which would allow bypassing the staleness checks so an old set of bundled roots could be used, but I also want to point out that as described the API does let you do this dangerous behavior already. By providing a way to fetch and/or parse the NSS (or whatever) list separately from the main bundle module, and the SetFallbackRoots API, you can register whatever set of roots you want, regardless of how old they are. The main module should, in my opinion, be the simplest, most straight forward solution for the majority of users, but as designed the API doesn't prevent more advanced users from doing whatever they like.

And now for the other points:

  1. We already discussed the use case by @stapelberg, who would like to access the root pool. ...
  2. I think the choice, if the root pool is still considered "recent enough" needs to be with the user of this package. I can think of situations, where users of this package would like to modify the pool of root certs before registering them as fallback (consider a CLI API client for one particular API, in this case it would be enough, if the root certs needed for this particular API are registered).

The root pool itself is already accessible in two ways with the API described, one by simply generating it using the exposed API for generating the bundled root set, and the other via x509.SystemCertPool which would return the default root pool, which if the fallback code path triggered would contain the pool embedded in the package.

Similarly users who wish to extend the root pool can do so in the same way they'd do currently extend the system cert pool, by mutating the pool returned from x509.SystemCertPool and passing that to x509.Certificate.Verify.

Thanks for this workaround, I think I can life with this. Just for the record, one needs to set both, SSL_CERT_FILE and SSL_CERT_DIR to /dev/null for this to work. Only setting SSL_CERT_FILE was not enough on my Linux machine.

Ah yup good point. Another option would be to add a new env var, e.g. SSL_ROOT_FALLBACK, which always forces the fallback, and might be slightly more obvious.

stapelberg commented 2 years ago

I think that should be default behavior.

We could add an escape valve for advanced users (be that a env var or something else),

As long as there’s an escape valve, I don’t mind the suggested default behavior.

The root pool itself is already accessible in two ways with the API described, one by simply generating it using the exposed API for generating the bundled root set, and the other via x509.SystemCertPool which would return the default root pool, which if the fallback code path triggered would contain the pool embedded in the package.

I want to point out that x509.SystemCertPool is not sufficient for my use-case. As I described in more detail in https://github.com/golang/go/issues/43958#issuecomment-1261205360, for my case, I need to write the certificates to disk (so that another Go process can at a later point, and on a different device, read and use them). The x509.CertPool type only allows usage, but not exporting.

FiloSottile commented 2 years ago

I fully agree on the risk to the WebPKI of stale root stores. However, I don't think we can justify making applications spontaneously break overnight. The idea is that fallback roots will be what makes the difference between an application that works and one that doesn't, so the scenario would be: the application is tested extensively, everything works, it gets deployed to production, forgotten about because it works fine for a while, and then suddenly stops working. This will ring pagers and take services down. That's not ok.

What I think we should do is what we've been advocating for with the Linux distributions, which repackage the Mozilla store like we would: treat every single root store update as a security update, put it into the vulndb, and let govulncheck flag it both in development and in production. That's the right flow: the root store gets updated, a flag gets raised, and SREs update and redeploy the application, without downtime.

FiloSottile commented 2 years ago

About exposing the pool, it's my understanding that there are two use cases:

  1. I want to have roots embedded for use in this application, so it will work without system roots.
    • This is the original use case of empty containers.
  2. I want access to a set of roots to embed into something else.
    • This is @stapelberg's use case.

I'll point out that they are only tangentially related.

One way to solve (2) is to make the pool we embed for (1) exportable, but I like better the idea of exporting the generator that produces the pool in (1). That's because the generator does not get stale as quickly as the pool itself, so @stapelberg won't need to stay up to date with the new module versions as religiously, and will still get an updated pool by running the generator.

If we expose the pool itself, we are doubling the latency: when the pool gets updated the generating application needs to be updated, and then re-run to produce a new embedded set. If we expose the generator, the application just needs to be re-run.

FiloSottile commented 2 years ago

Summarizing, here's a color I like for the bikeshed.

package x509 // crypto/x509

// SetFallbackRoots sets the roots to use during certificate verification, if no
// custom roots are specified and a platform verifier or a system certificate
// pool is not available (for instance in a container which does not have a root
// certificate bundle).
//
// SetFallbackRoots may only be called once, if called multiple times, it will
// panic. Most users should not call SetFallbackRoots directly, and instead
// import a recent version of the golang.org/x/rootcerts/bundle package.
func SetFallbackRoots(roots *CertPool)
// Package rootcerts provides the tools to fetch a X.509 root store pool.
//
// Most users should use golang.org/x/rootcerts/bundle instead.
package rootcerts // golang.org/x/rootcerts

// FetchOptions specifies options for fetching the trust store.
type FetchOptions struct {
    // HTTPClient is the client used to download the latest roots. If nil, a
    // default client with sensible timeouts is used.
    HTTPClient *http.Client
}

// FetchRoots fetches the root certificate trust store, and parses it into
// a list of *x509.Certificates.
func FetchRoots(opts FetchOptions) ([]*x509.Certificate, error)

// FetchRootsAsPEM fetches the root certificate trust store, and produces it as
// a list of concatenated PEM entries.
func FetchRootsAsPEM(opts FetchOptions) ([]byte, error)
// Package bundle embeds a set of fallback X.509 trusted roots in the
// application by automatically invoking [x509.SetFallbackRoots]. This allows
// the application to work correctly even if the operating system does not
// provide a verifier or system roots pool.
//
// To use it, import the package like
//
//     import _ "golang.org/x/rootcerts/bundle"
//
// It's recommended that only binaries, and not libraries, import this package.
//
// This package must be kept up to date for security and compatibility reasons.
// Use govulncheck to be notified of when new versions of the package are
// available.
package bundle // golang.org/x/rootcerts/bundle

I dropped NSS from the names because I think we should allow ourselves to change the source of the roots later.

I considered adding a GeneratedTime to package bundle, but if there are no changes it's actually ok for the bundle to be a little older, and I couldn't come up with a "correct" recommendation for how old it should be. I considered a IsUpToDate function, but concluded that this is really govulncheck's job.

Picked panic for SetFallbackRoots reuse, but I don't have a strong opinion. SetFallbackRoots is a low-level API anyway, my hope is that basically no one will use it except golang.org/x/rootcerts/bundle.

rolandshoemaker commented 2 years ago

I fully agree on the risk to the WebPKI of stale root stores. However, I don't think we can justify making applications spontaneously break overnight. The idea is that fallback roots will be what makes the difference between an application that works and one that doesn't, so the scenario would be: the application is tested extensively, everything works, it gets deployed to production, forgotten about because it works fine for a while, and then suddenly stops working. This will ring pagers and take services down. That's not ok.

What I think we should do is what we've been advocating for with the Linux distributions, which repackage the Mozilla store like we would: treat every single root store update as a security update, put it into the vulndb, and let govulncheck flag it both in development and in production. That's the right flow: the root store gets updated, a flag gets raised, and SREs update and redeploy the application, without downtime.

I think treating store updates as security updates solves a large portion of the issues I have here, but I think it misses one large issue which you hint at. govulncheck (and dependabot etc) are able to surface issues with stale stores for library/tooling maintainers, but a big use case for this is going to be set-and-forget binaries which once deployed are unlikely to be scanned (although govulncheck does provide this functionality, if people are going to be routinely doing this with deployed binaries, which may be packaged divorced from the Go toolchain, seems improbable). Perhaps we can convince maintainers to issue their own security releases when trust store updates happen, and this will trickle down to various packaging ecosystems, but there is some circumstantial evidence that this doesn't happen with other existing ecosystems already (i.e. Java).

What I'd like to avoid as much as possible is to avoid some large deployment of \<insert popular packaged thing> becoming another contributor to the considerations everyone in the web PKI has to make about what is trusted by who and why in the same way we have to consider the deployed LTS versions of linux distros are, or the versions of JVM, etc etc etc.

Perhaps this is not really a problem we can solve, but I think it's something we should really think about before giving up on it. I think having a short lifespan for the package (if we did implement some kind of self-destruct) would be obviously counter productive, but similarly if the package is deployed in a stale state for a long period of time it's likely to lead to its own odd failure modes (in particular starting to encounter more and more untrusted certificates because of a lack of new roots) which will cause their own, confusing pages.