Add provenance to Step model and publish to Grafeas

mattmoor commented 6 years ago

We should incorporate a story for tracking provenance into the Build CRD's model.

Within the current model (without further restriction), any step can fetch additional inputs or publish outputs.

We need a format in which steps can surface this information, a mechanism by which they do so (a volume? the termination message?), and a mechanism by which this information is published somewhere (Grafeas?).

imjasonh commented 6 years ago

Today in GCB a build can specify images which we'll attempt to push after steps have finished. Since we do the push, we also scrape the digest reported by the registry. We record that in the Build message in the BuildResults, and we also record the ID of the build that pushed that image in a Grafeas-like container metadata service. So you can either look up images+digests pushed by some build, or the build that pushed this image+digest.

We also fetch source before we execute the steps, and record exactly what Git SHA or Cloud Storage object generation was fetched, in sourceProvenance. So a user could go from an image, to the build that built it, to the exact revision that was fetched by the build.
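To make the two lookup directions concrete, here is a minimal in-memory sketch. It is not how GCB stores this data; the class name, method names, and the string used for the source reference are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceIndex:
    """Hypothetical index linking builds, pushed images, and fetched source."""

    images_by_build: dict = field(default_factory=dict)
    build_by_image: dict = field(default_factory=dict)
    source_by_build: dict = field(default_factory=dict)

    def record_source(self, build_id, source_ref):
        # source_ref stands in for a Git SHA or a Cloud Storage object generation.
        self.source_by_build[build_id] = source_ref

    def record_push(self, build_id, image, digest):
        # Store both directions so either lookup is a single dictionary access.
        self.images_by_build.setdefault(build_id, []).append((image, digest))
        self.build_by_image[(image, digest)] = build_id

    def images_for_build(self, build_id):
        return list(self.images_by_build.get(build_id, []))

    def build_for_image(self, image, digest):
        return self.build_by_image.get((image, digest))

    def source_for_image(self, image, digest):
        # Image -> build -> exact source revision fetched by that build.
        build_id = self.build_for_image(image, digest)
        if build_id is None:
            return None
        return self.source_by_build.get(build_id)
```

An image that was pushed by a step rather than by the build itself would never reach `record_push`, so `source_for_image` would return `None` for it.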

This only works in GCB when the build specifies source and images, though. If a step pushes an image itself (as FTL does, as bazel's docker_push does, or as any step that runs docker push itself does), we lose that connection.

So for GCB we're exploring a model where steps can report what they did. The plan is, we'll mount a volume to each step's container at /builder/outputs/ and look for files the builder writes there describing images it pushed, in a form we expect:

{ "image": "gcr.io/foo/bar:tag", "digest": "sha256:deadbeefcafe" }
...possibly more entries here...
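As a rough sketch of how a step might write records in this shape and how the builder might read them back: the example assumes one JSON object per line and a file named `images.json`, neither of which is settled above. Both helper functions are hypothetical.

```python
import json
from pathlib import Path


def write_image_record(outputs_dir, image, digest, filename="images.json"):
    """Append one pushed-image record as a single JSON line.

    The filename and the one-object-per-line layout are assumptions.
    """
    path = Path(outputs_dir) / filename
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"image": image, "digest": digest}) + "\n")


def read_image_records(outputs_dir, filename="images.json"):
    """Return all records in the file, or an empty list if the step wrote none."""
    path = Path(outputs_dir) / filename
    if not path.exists():
        return []
    records = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # tolerate blank lines between entries
        entry = json.loads(line)
        if not {"image", "digest"} <= entry.keys():
            raise ValueError(f"record is missing image or digest: {line!r}")
        records.append(entry)
    return records
```

In the proposed setup a step would call `write_image_record("/builder/outputs", ...)` after a successful push, and the builder would call `read_image_records` on the same volume once the step exits.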

This format is specifically for pushed Docker images, since that's the one we most urgently need, but one could imagine a similar format for:

GCS objects the step uploaded or fetched
Git repo+revision the step fetched or pushed
HTTP URLs the step fetched, and the downloaded objects' digests.

We could augment our official builder images for git, gsutil, wget, etc., to write this information in the expected format.

This lets steps contribute more visibility into what was brought into and pushed out of a build during its execution, which can then play into runtime policies that enforce that a cluster can only run images built from source that's been code-reviewed, or only from certain branches, or only from objects uploaded by a release bot, etc.
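A runtime policy of that kind could look roughly like the check below. This is only a sketch: the provenance record is a plain dictionary, and its `branch` and `reviewed` fields are hypothetical names, not part of any existing GCB or Grafeas schema.

```python
def is_admissible(provenance, allowed_branches, require_review=True):
    """Decide whether an image may run, based on what is known about its origin.

    provenance is a dict such as {"branch": "master", "reviewed": True},
    or None when no provenance was recorded for the image.
    """
    if provenance is None:
        # No recorded link back to source: reject rather than guess.
        return False
    if provenance.get("branch") not in allowed_branches:
        return False
    if require_review and not provenance.get("reviewed", False):
        return False
    return True
```

The important design point is the first branch: an image pushed by a step that reported nothing has no provenance, so a strict policy would refuse it.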

To support this kind of thing in the CRD, we could export this model (if it sounds good), and inject a container after each step container that checks for output files in the expected per-step volume and reports them to Grafeas.
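The injected container's job might be sketched as follows. The example assumes each step's volume holds `*.json` files with one JSON object per line, and it takes the publishing function as a parameter instead of calling any real Grafeas client; `collect_step_outputs`, `report_step`, and `publish` are all hypothetical names.

```python
import json
from pathlib import Path


def collect_step_outputs(step_dir):
    """Read every record from the *.json files in one step's output volume."""
    records = []
    for path in sorted(Path(step_dir).glob("*.json")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.strip():
                records.append(json.loads(line))
    return records


def report_step(step_name, step_dir, publish):
    """Publish each record, tagged with the step that produced it.

    publish is any callable taking one dict; in a real deployment it would
    wrap whatever metadata service is chosen. Returns the number reported.
    """
    records = collect_step_outputs(step_dir)
    for record in records:
        publish({"step": step_name, **record})
    return len(records)
```

Running this after each step, rather than once at the end of the build, keeps the attribution per step and still reports something if a later step fails.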

There are still some things we need to iron out with this plan, and any feedback is very very welcome, but if this looks promising in GCB, I see no reason why we can't bring it out to the CRD too.

knative-housekeeping-robot commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
