sigstore / cosign

Code signing and transparency for containers and binaries
Apache License 2.0
4.36k stars 531 forks source link

`cosign` as a library #666

Closed mattmoor closed 2 years ago

mattmoor commented 2 years ago

Description

tl;dr I propose the creation of a well-curated set of libraries for signing, that compose extremely well with and complement github.com/google/go-containerregistry for signing OCI images and indices.

cosign has gained considerable traction as a mechanism for signing OCI artifacts as a standalone executable. I believe that to kick this into an even higher-gear, we want to encourage ecosystem tooling to natively integrate cosign as a library to naturally extend image production to include signing.

cosign (along with a growing list of other tooling) builds around github.com/google/go-containerregistry (aka "GGCR") for interacting with OCI images and indices, so this isn't a huge reach, it is mostly refactoring the code we have today to better align with the way we do things in GGCR so that it is trivially simple for folks to adopt signing, if they've adopted GGCR.


One of the directions I'd like to see us push the library portion of this is to align with the "philosophy" of GGCR, in part to enable a wider variety of use cases.

For example, today cosign is very oriented around registry-based images. Despite it's name GGCR actually allows interacting with images in a wide variety of formats, including things like docker save-tarballs, OCI-layouts, and more.

If cosign were to adhere to some of the principles of GGCR, we might see signing take the form of (v1.Image, Options) -> v1.Image, where the input image is what we want signed and the output image is the signature payload we want recorded somewhere.

With this separation of the signing from the storage, consumers can then leverage a variety of mechanisms to publish the resulting image in different ways. They can choose to remote.Write that image to a registry following cosign's naming convention, or they can tarball.Write.


GGCR is all about enabling cool new things thru composition, and so this is very much about making sure cosign becomes a tool compatible with that ecosystem of composition!

I am still wrapping my head around everything that cosign does, so this picture is starting to form in my head, but if folks are receptive to this (or better willing to help) then I'd love to discuss this direction.

cc @jonjohnsonjr @imjasonh @dekkagaijin @dlorenc

dlorenc commented 2 years ago

@jonjohnsonjr who could have seen this coming?

mattmoor commented 2 years ago

Only someone with vision 😏

dekkagaijin commented 2 years ago

Gonna add another +1 here to the idea of composition :p

samj1912 commented 2 years ago

+1 This would help a lot with integrations into the Buildpacks ecosystem. Our lifecycle and various platforms almost all use/heavily rely on GGCR. cc: @natalieparellano @matthewmcnew @jromero

cpanato commented 2 years ago

+1

mattmoor commented 2 years ago

I'm gonna start small. I am staging a PR to create ./internal/oci/empty to host our own flavor of empty.Image().

I think we should use internal/ to keep folks from adopting this prematurely, and let us move quickly if we want to change things (possibly dramatically). Once we're happy, we can move this out and tread more carefully.

mattmoor commented 2 years ago

I think the next step (for tomorrow) will be to set up a ./internal/oci/signed.Image, which constructs the signature artifact itself.

The way I'm thinking about it now, it should take functional options for a few things, like:

There may be other things as well, but the less variable/optional things would likely just be passed directly.

I'm hoping this will let us reduce UploadSignature to setting up for signed.Image and then feeding that to remote.Write 🤞

imjasonh commented 2 years ago

One naming question: a signed.Image sounds like it would be the original image, plus a Signature() (v1.Image, error) method to get the signed image's signature. Is that what you're intending?

Since signatures are actually just images, you could have a type Signature v1.Image* and a func Sign(v1.Image, Config) (Signature, error), to make it clearer when passing around the signature that it's the signature and not the signed-thing. This would let you have methods that take a Signature and not a v1.Image, for example.

*type Signature v1.Image could also be a struct or interface that embeds v1.Image to add more signature-specific functionality.

mattmoor commented 2 years ago

Yeah, totally +1. I had the same thought as I walked up to bed and made a mental note to change this to ‘signature.Image’, since it is-a signature, where ‘signed.Image’ sounds like it has-a signature.

mattmoor commented 2 years ago

I do like the idea though of extending the Image interface 🤔

Perhaps we have: signature.Image extend v1.Image with accessors for signature related fields and abstract the layer walking etc. Same for attestation.Image.

I also wonder about defining an actual signed.Image that wraps v1.Image and allows folks to associate signature and attestation images. We could have a mutate package to facilitate this.

Then we start to think about things like remote.Get and remote.Write, which could be used to publish a fully signed image with attestations 🤩

I think all of these would be relatively thin shims around ggcr.

imjasonh commented 2 years ago

signature.Image feels like a misnomer to me -- sure, it's technically stored as an image, but consumers of the package shouldn't have to care about that detail. To them, it's a "signature", and maybe there's some escape hatch to get at its image bits (layers, etc.), but mostly we hope they don't have to.

I also wonder about defining an actual signed.Image that wraps v1.Image and allows folks to associate signature and attestation images. We could have a mutate package to facilitate this.

signed.Image could embed v1.Image and add Signature() (cosign.Signature, error), but then you have to figure out how to handle that if the wrapped v1.Image isn't a remote.Image. You'd need some generalized way to associate things, regardless of remote-ness.

That being said: whatever you do, as long as it's in internal, go nuts. 😆

mattmoor commented 2 years ago

@imjasonh I think it'd be fair for the Signature stuff to be an alias of v1.Image, but not be called Image, this would also let us iterate with some of these highly correlated things within a single package 😅

That being said: whatever you do, as long as it's in internal, go nuts

Yeah, I wanted the flexibility to put something together, let it marinate and then hate it and tear it up for something better 🤣

mattmoor commented 2 years ago

Ok, so I spent some time thinking about this on my whiteboard, and I'm going to try and articulate things here, and then start breaking off pieces.

I think that we want an aggregation type for reasoning about images, their signatures and attestations that extends v1.Image:

// SignedImage represents an OCI Image, complemented with accessors
// for retrieving signed metadata associated with that image.
type SignedImage interface{
   v1.Image

   // Signatures returns the set of signatures currently associated with this
   // image, or the empty equivalent if none are found.
   Signatures() (Signatures, error)

   // Attestations returns the set of attestations currently associated with this
   // image, or the empty equivalent if none are found.
   Attestations() (Attestations, error)
}

// Signatures represents a set of signatures that are associated with a particular
// v1.Image.
type Signatures interface{
   v1.Image // The low-level representation of the signatures

  // TODO: Accessors that build on `v1.Image` to provide higher-level accessors
  // for the signature data that is embedded in the wrapped `v1.Image`
}

// Attestations represents a set of attestations that are associated with a particular
// v1.Image.
type Attestations interface{
   v1.Image // The low-level representation of the attestations

  // TODO: Accessors that build on `v1.Image` to provide higher-level accessors
  // for the attestation data that is embedded in the wrapped `v1.Image`
}

// SignedIndex represents an OCI ImageIndex, complemented with accessors
// for retrieving signed metadata associated with that ImageIndex.
type SignedImageIndex interface{
   v1.ImageIndex

   // SignedImage is the same as Image, but provides accessors for the nested
   // image's signed metadata.
   SignedImage(v1.Hash) (SignedImage, error)

   // SignedImageIndex is the same as ImageIndex, but provides accessors for
   // the nested image index's signed metadata.
   SignedImageIndex(v1.Hash) (SignedImageIndex, error)

   // Signatures returns the set of signatures currently associated with this
   // image, or the empty equivalent if none are found.
   Signatures() (Signatures, error)

   // Attestations returns the set of attestations currently associated with this
   // image, or the empty equivalent if none are found.
   Attestations() (Attestations, error)
}
mattmoor commented 2 years ago

I think my inclination here is to land this as an interface under internal/oci and then we can start to create an implementation for accessing all of this information in a high-level way (moving the logic we have now behind the above interface) under internal/oci/remote

mattmoor commented 2 years ago

So I'm think that instead of having FetchSignatureForImage which linearizes a set of SignedPayloads that cosign.Verify then processes, we can surface an oci.Map function, which walks the images and signatures of an entity (possibly recursively) and calls some function on them.

This is based around a similar idea I used in mink here: https://github.com/mattmoor/mink/blob/7b163e07f73fbf5dfd7f442c691dc77873d36a93/pkg/bundles/map.go#L110-L118

I think what I have in mind is along the lines of (very roughly):

// Fn is the signature of the callback supplied to Map.  This function will be called on each of the
// entities and signatures contained within a parent entity (including the parent entity).
type Fn func(context.Context, oci.SignedEntity) error

// Map calls `fn` on the signed entity and each of its constituent entities (`SignedImageIndex`
// or `SignedImage`) transitively.  Any errors returned by an `fn` are returned by `Map`.
func Map(ctx context.Context, parent oci.SignedEntity, fn Fn) error {

Some of the things I'm debating:

Anyways, I'll let it keep marinating and hopefully play with some code layer 😁

mattmoor commented 2 years ago

I think the next piece of this to start looking at and peeling apart is the actual signing process. Today this all seems to flow through UploadSignature, which does a handful of things currently, including:

  1. Load any existing signatures,
  2. Scan through existing signatures to see whether what we're about to write is redundant,
  3. If it isn't redundant, add it to the pile of signatures,
  4. remote.Write the manifest with full set of signatures.

I've been playing around with this function, and there are a few things I'd like to do:

  1. Make the layer we build implement oci.Signature (possibly multiple signatures for different forms),
  2. Reorient duplicate detection around the oci.Signature interface (requires adding Annotations() to oci.Signature),
  3. Create a new mutate method for adding an oci.Signature to an oci.Signatures.
  4. Wrap it all up by feeding that oci.Signatures to remote.Write

My working form of this feels messy (thus my comment about multiple constructors for different forms), it's something like:

func UploadSignature(signature, payload []byte, dst name.Reference, opts UploadOpts) error {
    b64sig := base64.StdEncoding.EncodeToString(signature)
    l := &staticLayer{
        b:  payload,
        mt: ctypes.SimpleSigningMediaType,
        annotations: kmeta.UnionMaps(opts.AdditionalAnnotations, map[string]string{
            sigkey: b64sig,
        }),
        b64sig:   b64sig,
        certPEM:  string(opts.Cert),
        chainPEM: string(opts.Chain),
        bundle:   opts.Bundle,
    }
    // Preserve the default
    if opts.MediaType != "" {
        l.mt = types.MediaType(opts.MediaType)
    }

    if opts.Cert != nil {
        l.annotations[certkey] = l.certPEM
        l.annotations[chainkey] = l.chainPEM
    }
    if opts.Bundle != nil {
        b, err := swag.WriteJSON(opts.Bundle)
        if err != nil {
            return errors.Wrap(err, "marshaling bundle")
        }
        l.annotations[BundleKey] = string(b)
    }

...

Looking at the patterns of options being passed into this, I see the following variations

The ./cmd/cosign/cli/attach/sig.go form is the simplest, and passes no meaningful options other than transport stuff:

    return cremote.UploadSignature(sigBytes, payload, dstRef, cremote.UploadOpts{RemoteOpts: remoteOpts})

The ./cmd/cosign/cli/sign/sign.go form is a bit more complicated setting Cert, Chain and optionally Bundle:

        uo := cremote.UploadOpts{
            Cert:         sv.Cert,
            Chain:        sv.Chain,
            DupeDetector: sv,
            RemoteOpts:   remoteOpts,
        }

        // Check if the image is public (no auth in Get)
        uploadTLog, err := ShouldUploadToTlog(img, force, ko.RekorURL)
        if err != nil {
            return err
        }

        if uploadTLog {
            ...

            uo.Bundle = Bundle(entry)
            uo.AdditionalAnnotations = ParseAnnotations(entry)
        }

        fmt.Fprintln(os.Stderr, "Pushing signature to:", sigRef.String())
        if err := cremote.UploadSignature(sig, payload, sigRef, uo); err != nil {

Finally, the ./cmd/cosign/cli/attest.go form is very slightly more complicated, as it also sets the MediaType, but it also does some funny business leaving the signature empty (see the final comment):

    uo := cremote.UploadOpts{
        Cert:         sv.Cert,
        Chain:        sv.Chain,
        DupeDetector: sv,
        RemoteOpts:   remoteOpts,
        MediaType:    types.DssePayloadType,
    }

    uploadTLog, err := sign.ShouldUploadToTlog(ref, force, ko.RekorURL)
    if err != nil {
        return err
    }

    if uploadTLog {
        ...

        uo.Bundle = sign.Bundle(entry)
        uo.AdditionalAnnotations = sign.ParseAnnotations(entry)
    }

    fmt.Fprintln(os.Stderr, "Pushing attestation to:", attRef.String())
    // An attestation represents both the signature and payload. So store the entire thing
    // in the payload field since they can get large
    return cremote.UploadSignature([]byte{}, sig, attRef, uo)

So I'm wondering whether we should have a couple different constructors for these, with some sort of options to pass Bundle as needed or a mutate function to infuse the layer with the bundle post-facto...

If folks have thoughts on this, I'd love to hear them, otherwise I'll take a crack at it next time I have some bandwidth to play around. 🤞

dlorenc commented 2 years ago

I'd be fine dropping the duplicate detection stuff. I liked it when we put it in, but it doesn't seem to help in practice.

mattmoor commented 2 years ago

I already refactored it a bit, so I'll leave it alone for now unless it gets annoying again 🤷

mattmoor commented 2 years ago

Ok, so I think for the next chunk, what I'm looking at is the logic in SignCmd which intertwines recursively walking indices with signing and publishing.

The sketch I'm starting to build in my head is essentially replacing all of the recursive handling with mutate.Map. Something along the lines of:

    for _, inputImg := range imgs {
        ref, err := GetAttachedImageRef(inputImg, attachment, remoteOpts...)
        if err != nil {
            return fmt.Errorf("unable to resolve attachment %s for image %s", attachment, inputImg)
        }

        se, err := ociremote.SignedEntity(ref, regOpts.ClientOpts(ctx)...)
        if err != nil {
            return err
        }

        _, err = mutate.Map(ctx, se, func(ctx context.Context, se oci.SignedEntity) (oci.SignedEntity, error) {
...

We'll fold the bulk of the existing signing logic into this function, and then replace UploadSignature with the following epilog:

            // Create the new signature for this entity.
            sig, err := static.NewSignature(payload, b64sig, opts...)
            if err != nil {
                return nil, err
            }

            // Attach the signature to the entity.
            newSE, err := mutate.SignEntity(se, sig, mutate.WithDedup(dd))
            if err != nil {
                return nil, err
            }

            // Publish the signatures associated with this entity
            if err := ociremote.WriteSignatures(digest, newSE, regOpts.ClientOpts(ctx)...); err != nil {
                return nil, err
            }
            return se, ErrDone   // ErrDone is optionally set to nil or ErrSkipChildren depending on whether we are recursive

Several of the helpers in this epilog are (at present) vaporware, but the next step would be to knock these off and then chase this shape.

mattmoor commented 2 years ago

I'm going to drop the Attestations path for now, which is wholly NYI everywhere, since everyone is just treating these as a different form of oci.Signatures and accessing them by overriding the signature tag suffix.

https://github.com/sigstore/cosign/pull/769