thought-machine / please

High-performance extensible build system for reproducible multi-language builds.
https://please.build
Apache License 2.0

SLSA and Provenance attestation data for builds #3121

Open matglas opened 3 months ago

matglas commented 3 months ago

I am using Please Build and would love to see some provenance attestation data being output when I run a build. Lately the requests for provenance data in open source are growing, and there is a need for it in business environments as well. There are great examples based on in-toto attestations; for example, the SLSA Provenance predicate that is part of the SLSA framework (https://slsa.dev) would be very useful.

I believe that please.build could generate this data very well and output the in-toto statement. Another tool could be used to sign the statement for validation.

Would there be interest in it if I do some PoC work on something like this?

peterebden commented 3 months ago

Yes, there would definitely be interest! We've talked about doing more in this space internally (but obviously haven't actually gotten to it yet) - I think it would very much be useful, and (as you point out) it is something Please should have all the information to be able to produce.

matglas commented 3 months ago

I will try to add some context and some references here related to provenance so it's possible to come to an implementation plan, and so others can learn about it later.

The Supply-chain Levels for Software Artifacts, or SLSA ("salsa"), framework is one of the more generic security frameworks that has been developed and embraced over the last few years. It aims to implement "standards and controls to prevent tampering, improve integrity, and secure packages and infrastructure." The framework has three levels through which you can mature as a software producer.

A few core principles are:

More details can be read on the https://slsa.dev website.

Many of the requirements of the SLSA framework are already covered implicitly by Please Build. One major thing that ties all the parts of the SLSA framework together is the use of Attestations. An Attestation is a statement of proof about a thing, in this case an output artifact. And there is a lot that Please Build can prove.

To get a better view of what an Attestation looks like, take a look at this Attestation model: https://slsa.dev/attestation-model#model-and-terminology

A Provenance Attestation (https://slsa.dev/spec/v1.0/provenance) is an Attestation that proves that some artifacts (plz-out/{bin,gen} files) were created by Please Build while executing targets.

An example of such a Provenance Attestation for Please Build could look like this:

{
    // This is predefined
    "_type": "https://in-toto.io/Statement/v1",

    // This is predefined
    "predicateType": "https://slsa.dev/provenance/v1",

    // This follows a schema.
    "predicate": {

        "buildDefinition": {
            "buildType": "https://please.build/buildtypes/run/v1",

            // Maybe put the command arguments in here.
            "externalParameters": {
                "profile": [".plzconfig.ci", ".plzconfig.local"],
                "target": "//foo:bar",
                "include": ["baz"],
                "exclude": ["pop"]
            },

            // Maybe put the final config in here.
            "internalParameters": {
                "version": "v17.8.5",
                "buildConfig": {
                    "build_id": "123456"
                },
                "buildEnv": {
                    "DOO": "tee"
                },
                "plugins": [
                    {
                        "shell": {
                            "uri": "git+https://github.com/please-build/shell.git",
                            "digest": {
                                "gitCommit": "c27d339ee6075c1f744c5d4b200f7901aad2c369"
                            }
                        }
                    }
                ]
            },
            "resolvedDependencies": [
                {
                    "uri": "git+https://github.com/octocat/hello-world@refs/heads/main",
                    "digest": {
                        "gitCommit": "c27d339ee6075c1f744c5d4b200f7901aad2c369"
                    }
                },
                {
                    "uri": "https://github.com/actions/virtual-environments/releases/tag/ubuntu20/20220515.1"
                }
            ]
        },
        "runDetails": {
            "builder": {
                "id": "https://please.build/slsa-framework/slsa-level-1@refs/tags/v0.0.1"
            },

            // Optional. Maybe set the invocationId with a command argument.
            "metadata": {
                "invocationId": "https://ci.example.com/job/1",
                "startedOn": "2023-01-01T12:34:56Z",
                "finishedOn": "2023-01-01T12:44:56Z"
            }
        }
    },
    "subject": [
        {
            "name": "file://plz-out/gen/foo/bar.txt",
            "digest": {
                "sha256": "fe4fe40ac7250263c5dbe1cf3138912f3f416140aa248637a60d65fe22c47da4"
            }
        }
    ]
}
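
As a rough illustration of how a statement with this shape could be assembled from build outputs, here is a minimal Python sketch. It is not how Please does or would implement this: `sha256_of` and `make_statement` are hypothetical helper names, the `buildType` and builder `id` are simply the placeholder values from the example above, and the optional `metadata` block and most parameters are left out for brevity.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def make_statement(target: str, outputs: List[Path], plz_version: str) -> dict:
    """Assemble an in-toto Statement carrying a SLSA provenance predicate.

    Hypothetical helper: the buildType and builder id below are the
    placeholder values from the example, not identifiers Please defines.
    """
    return {
        "_type": "https://in-toto.io/Statement/v1",
        # One subject per output file, identified by its SHA-256 digest.
        "subject": [
            {"name": "file://" + p.as_posix(), "digest": {"sha256": sha256_of(p)}}
            for p in outputs
        ],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                "buildType": "https://please.build/buildtypes/run/v1",
                "externalParameters": {"target": target},
                "internalParameters": {"version": plz_version},
                # Left empty here; a real build would list its inputs.
                "resolvedDependencies": [],
            },
            "runDetails": {
                "builder": {
                    "id": "https://please.build/slsa-framework/slsa-level-1@refs/tags/v0.0.1"
                },
            },
        },
    }
```

Calling `json.dumps(make_statement("//foo:bar", [Path("plz-out/gen/foo/bar.txt")], "v17.8.5"), indent=4)` would produce a comment-free JSON document along the lines of the example, which a separate tool could then sign.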

matglas commented 2 months ago

I'm working on an initial implementation draft that collects all kinds of information that belongs inside a Provenance Attestation. This is based on the state and config data at the end of the run command. You can see it in the PR mentioned, which is in draft at this point.