tektoncd / chains

Supply Chain Security in Tekton Pipelines
Apache License 2.0
245 stars 126 forks source link

Chains adds `chains.tekton.dev/signed=true` annotation even if no keys supplied #858

Open PuneetPunamiya opened 1 year ago

PuneetPunamiya commented 1 year ago

For example :

I create this simple taskrun which is mentioned in the getting-started tutorial

apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: build-push-run-output-image-test
spec:
  serviceAccountName: ""
  taskSpec:
    results:
    - name: IMAGE_URL
      type: string
    - name: IMAGE_DIGEST
      type: string
    steps:
    - name: create-image
      image: busybox
      script: |-
        #!/usr/bin/env sh
        echo 'gcr.io/foo/bar' | tee $(results.IMAGE_URL.path)
        echo 'sha256:05f95b26ed10668b7183c1e2da98610e91372fa9f510046d4ce5812addad86b5' | tee $(results.IMAGE_DIGEST.path)
kubectl create -f https://raw.githubusercontent.com/tektoncd/chains/main/examples/taskruns/task-output-image.yaml

Now even if no field is provided in the the chain-config configMap it adds annotation chains.tekton.dev/signed=true in the taskrun

╰─ tkn tr describe --last 
Name:              build-push-run-output-image-test
Namespace:         default
Service Account:   default
Timeout:           1h0m0s
Labels:
 app.kubernetes.io/managed-by=tekton-pipelines
Annotations:
 chains.tekton.dev/signed=true
 pipeline.tekton.dev/release=c802069

🌑️  Status

STARTED         DURATION    STATUS
2 minutes ago   28s         Succeeded

πŸ“ Results

 NAME             VALUE
 βˆ™ IMAGE_DIGEST   sha256:05f95b26ed10668b7183c1e2da98610e91372fa9f510046d4ce5812addad86b5
 βˆ™ IMAGE_URL      gcr.io/foo/bar

🦢 Steps

 NAME             STATUS
 βˆ™ create-image   Completed
lcarva commented 1 year ago

Hi @PuneetPunamiya! Thanks for reporting this.

An empty config value means the default value is loaded. The config values are documented here. The default configuration does require that the signing-secret is created which appears to be missing in your case: error configuring x509 signer: no valid private key found, looked for: [x509.pem, cosign.key]

Chains uses the mentioned annotation to indicate that it is done processing the resource. So if it sees a resource with that annotation, then it just ignores it. Otherwise, it may constantly keep retrying. signed is probably not the best name for this though.

I think the question here is how should Chains behave if it is misconfigured. Should it stay out of the way, or should it make this obvious to users creating TaskRun and PipelineRun resources? Or maybe, the Chains controller should have some sort of config validation at start up, making it obvious to whoever is administrating the controller.

PuneetPunamiya commented 1 year ago

Thanks @lcarva for the response πŸ‘πŸ»

An empty config value means the default value is loaded.

Yep agreed, I think in that case we should add those fields in the config map, so that user is aware which config values are set by default. wdyt ?

The default configuration does require that the signing-secret is created which appears to be missing in your case: error configuring x509 signer: no valid private key found, looked for: [x509.pem, cosign.key]

So if I'm not getting you wrong, we do have signing-secret created once chains is installed, but I think what you meant is the signing-secret was not updated/configured with the cosign cli and that is why it gives the following error. Also once we update the keys by executing cosign generate .. command, the signing fails πŸ‘πŸ»

I think the question here is how should Chains behave if it is misconfigured.

Yes exactly, I purposely added no config values from whic are documented here assuming the signing should fail

the Chains controller should have some sort of config validation at start up, making it obvious to whoever is administrating the controller.

Yeah, so if user doesn't provide any config values, validation returns a warning sort of and doesn't reconcile to take further actions. wdyt ?

lcarva commented 1 year ago

Yep agreed, I think in that case we should add those fields in the config map, so that user is aware which config values are set by default. wdyt ?

Do you mean here? If so, +1.

Yeah, so if user doesn't provide any config values, validation returns a warning sort of and doesn't reconcile to take further actions. wdyt ?

A warning may be easy to miss. Is preventing the controller from starting (by crashing it?) too heavy handed?

PuneetPunamiya commented 1 year ago

Yep agreed, I think in that case we should add those fields in the config map, so that user is aware which config values are set by default. wdyt ?

Do you mean here? If so, +1.

Yeah, I meant here only

Yeah, so if user doesn't provide any config values, validation returns a warning sort of and doesn't reconcile to take further actions. wdyt ?

A warning may be easy to miss. Is preventing the controller from starting (by crashing it?) too heavy handed?

How about erroring out in the controller logs and controller not proceeding further steps instead of crashing it ?

concaf commented 9 months ago

An empty config value means the default value is loaded. The config values are documented here. The default configuration does require that the signing-secret is created which appears to be missing in your case: error configuring x509 signer: no valid private key found, looked for: [x509.pem, cosign.key]

Chains uses the mentioned annotation to indicate that it is done processing the resource. So if it sees a resource with that annotation, then it just ignores it. Otherwise, it may constantly keep retrying. signed is probably not the best name for this though.

@lcarva i agree the signed is not the best name to indicate the processing of that resource; like you said, to a user chains.tekton.dev/signed=true indicates that chains was able to successfully sign the resource, hence this is very misleading.

what's the downside of setting chains.tekton.dev/signed=false/failed in this case?

lcarva commented 9 months ago

what's the downside of setting chains.tekton.dev/signed=false/failed in this case?

Setting it to "failed" is probably the right behavior. It will be challenging to get it right as we need to distinguish from use cases where there's just nothing to sign, e.g. https://github.com/tektoncd/chains/issues/458.

cc @wlynch, @chitrangpatel

concaf commented 9 months ago

wdyt about specifying a list of values that accurately describe the state? e.g.:

concaf commented 9 months ago

while i agree that chains.tekton.dev/signed should have multiple states as above ^ , but also in the absence of any keys (which means chains has nothing to sign with), the controller should not watch / process any taskruns/pipelineruns at all - why waste resources in going through each workload and add chains.tekton.dev/signed=skipped annotation

one data point to support this is #979 where we're already seeing slowness from chains controller due to needing to process a large number of tekton workloads - while that issue will be fixed differently, the chains controller should try to not watch workloads if not required, wdyt?

lcarva commented 9 months ago

I agree with @concaf here. Another way to look at it is the different personas that rely on Chains.

There's the user who is running Pipelines/Tasks, and the admin who is responsible for configuring Tekton Chains. In some cases, these may be the same person, but think we should assume that they are separate people with different access levels.

If Chains is misconfigured, there's nothing the user can do. The admin is the one responsible. With this premise, if Chains is misconfigured, we should make that obvious to the admin, while also minimizing the impact to the user.

This is why I suggested earlier on that if Chains is misconfigured, it should just crash. This should effectively allow the admin to fix things and delay Chains from processing any Run resources. Once the issue is resolved, Chains will catch up processing the Run resources.

lcarva commented 8 months ago

We discussed this in today's meeting. The overall consensus is to not proceed with the chains.tekton.dev/signed=skipped annotation approach. One of the promises of Chains is that it processes every resource in the cluster (if Chains is properly configured). This would violate that.

An approach that makes it obvious the controller is misconfigured is ideal, e.g. performing a basic config check at start up and crash if anything seems wrong (missing cosign key for example). This won't catch all the cases, but it's a starting point.

Additionally, we could create a mechanism that propagates errors back to the user. This could be done by using a new annotation or by adding a new status condition on the resource. This should bridge the gap of issues which are within the user's control to fix, e.g. malformatted linked secret.