tektoncd / chains

Supply Chain Security in Tekton Pipelines
Apache License 2.0

Index out of range error #1099

Closed avumdonny closed 3 months ago

avumdonny commented 3 months ago

Expected Behavior

The Tekton Chains Controller pod continues to run without errors while creating a supply chain for the pipelineruns and taskruns.

Actual Behavior

The Tekton Chains Controller pod keeps restarting every couple of minutes with the following error. The issue started about 10 days ago, but I only noticed it in the event logs a couple of days ago.

{"level":"warn","ts":"2024-04-08T20:40:12.085Z","logger":"watcher","caller":"chains/signing.go:72","msg":"error configuring x509 signer: no valid private key found, looked for: [x509.pem, cosign.key]","commit":"ebcd9c2","knative.dev/controller":"github.com.tektoncd.chains.pkg.reconciler.pipelinerun.Reconciler","knative.dev/kind":"tekton.dev.PipelineRun","knative.dev/traceid":"53b65d0f-5927-4bbf-a2eb-608aafa4f05c","knative.dev/key":"##########"}
panic: runtime error: index out of range [1] with length 1

goroutine 128 [running]:
github.com/tektoncd/chains/pkg/chains/formats/slsa/v1/pipelinerun.buildConfig({0x3ca7638?, 0xc00716f290?}, 0xc004d443a0)
    /go/src/github.com/tektoncd/chains/pkg/chains/formats/slsa/v1/pipelinerun/pipelinerun.go:106 +0x1129
github.com/tektoncd/chains/pkg/chains/formats/slsa/v1/pipelinerun.GenerateAttestation({0x3ca7638, 0xc00716f290}, 0xc00716f290?, 0xc0059c5620)
    /go/src/github.com/tektoncd/chains/pkg/chains/formats/slsa/v1/pipelinerun/pipelinerun.go:69 +0x19a
github.com/tektoncd/chains/pkg/chains/formats/slsa/v1.(*InTotoIte6).CreatePayload(0xc008b041f8, {0x3ca7638, 0xc00716f290}, {0x34f48c0?, 0xc004a38420?})
    /go/src/github.com/tektoncd/chains/pkg/chains/formats/slsa/v1/intotoite6.go:84 +0x525
github.com/tektoncd/chains/pkg/chains.(*ObjectSigner).Sign(0xc0004e2e40, {0x3ca7638, 0xc00716f290}, {0x3cdcb50, 0xc004a38420})
    /go/src/github.com/tektoncd/chains/pkg/chains/signing.go:147 +0x5d7
github.com/tektoncd/chains/pkg/reconciler/pipelinerun.(*Reconciler).FinalizeKind(0xc0004e2e80, {0x3ca7638, 0xc00716f290}, 0xc008a286c0)
    /go/src/github.com/tektoncd/chains/pkg/reconciler/pipelinerun/pipelinerun.go:109 +0x8cc
github.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1/pipelinerun.(*reconcilerImpl).Reconcile(0xc0006e9c20, {0x3ca7638, 0xc00716f1d0}, {0xc0082f4c40, 0x35})
    /go/src/github.com/tektoncd/chains/vendor/github.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1/pipelinerun/reconciler.go:241 +0x3be
knative.dev/pkg/controller.(*Impl).processNextWorkItem(0xc0001756e0)
    /go/src/github.com/tektoncd/chains/vendor/knative.dev/pkg/controller/controller.go:542 +0x4cd
knative.dev/pkg/controller.(*Impl).RunContext.func3()
    /go/src/github.com/tektoncd/chains/vendor/knative.dev/pkg/controller/controller.go:491 +0x68
created by knative.dev/pkg/controller.(*Impl).RunContext
    /go/src/github.com/tektoncd/chains/vendor/knative.dev/pkg/controller/controller.go:489 +0x354

Steps to Reproduce the Problem

I don't know what caused the issue, so I don't know how to replicate it.

Additional Info

lcarva commented 3 months ago

Thanks for filing this issue!

Could you please share the version of Tekton Chains you are using? I see that you have already shared the version of Tekton Pipeline which is useful as well.

Going by the assumption that the code from main is the same as in the version of Chains you are running, it looks like the index out of range error is coming from here:

steps := []attest.StepAttestation{}
for i, stepState := range tr.Status.Steps {
    step := tr.Status.TaskSpec.Steps[i]  // <-- here
    steps = append(steps, attest.Step(&step, &stepState))
}

I can't think of a situation where a TaskRun's Status.Steps would have more items than its Status.TaskSpec.Steps, but that seems to be what is happening. You likely have a TaskRun in your cluster that exhibits this behavior. If you find it, it may provide more insight into what is actually happening. (If you delete that TaskRun, Chains should proceed without errors.)
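One way to look for such a TaskRun is to scan a JSON dump of the TaskRuns and compare the two lengths. The following is only a rough sketch, not part of Chains: it assumes the input is the list produced by `kubectl get taskruns --all-namespaces -o json`, and the names `taskRunList` and `findMismatched` are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// taskRunList mirrors only the fields needed from a TaskRun list.
// (Hypothetical helper type, not a Tekton API type.)
type taskRunList struct {
	Items []struct {
		Metadata struct {
			Namespace string `json:"namespace"`
			Name      string `json:"name"`
		} `json:"metadata"`
		Status struct {
			Steps    []json.RawMessage `json:"steps"`
			TaskSpec *struct {
				Steps []json.RawMessage `json:"steps"`
			} `json:"taskSpec"`
		} `json:"status"`
	} `json:"items"`
}

// findMismatched returns "namespace/name" for every TaskRun whose
// status.steps has more entries than status.taskSpec.steps.
func findMismatched(data []byte) ([]string, error) {
	var list taskRunList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, err
	}
	var hits []string
	for _, tr := range list.Items {
		specSteps := 0
		if tr.Status.TaskSpec != nil {
			specSteps = len(tr.Status.TaskSpec.Steps)
		}
		if len(tr.Status.Steps) > specSteps {
			hits = append(hits, tr.Metadata.Namespace+"/"+tr.Metadata.Name)
		}
	}
	return hits, nil
}

func main() {
	// Reads the TaskRun list from standard input.
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		return
	}
	hits, err := findMismatched(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, "decode error:", err)
		return
	}
	for _, h := range hits {
		fmt.Println(h)
	}
}
```

Piping the kubectl output into this program should print one namespace/name per suspicious TaskRun, or nothing if none match.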

Regardless, Chains should be more careful by first verifying the index before accessing it, reporting an error if necessary.
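A minimal sketch of what that guard could look like is below. It is not the actual fix in Chains: `Step`, `StepState`, `StepAttestation`, `ErrStepMismatch` and `buildSteps` are simplified, hypothetical stand-ins for the real Tekton and Chains types, used only to show the length check before indexing.

```go
package main

import (
	"errors"
	"fmt"
)

// Step and StepState are simplified stand-ins for the entries of
// tr.Status.TaskSpec.Steps and tr.Status.Steps (hypothetical types).
type Step struct{ Name string }
type StepState struct{ Name string }

// StepAttestation is a stand-in for attest.StepAttestation (hypothetical).
type StepAttestation struct {
	StepName  string
	StateName string
}

// ErrStepMismatch is returned when the status lists more steps than the spec.
var ErrStepMismatch = errors.New("more step states than step specs")

// buildSteps pairs each step state with its spec by index. It returns an
// error instead of panicking when there are more states than specs.
func buildSteps(specs []Step, states []StepState) ([]StepAttestation, error) {
	if len(states) > len(specs) {
		return nil, fmt.Errorf("%w: %d states, %d specs",
			ErrStepMismatch, len(states), len(specs))
	}
	steps := make([]StepAttestation, 0, len(states))
	for i, state := range states {
		steps = append(steps, StepAttestation{
			StepName:  specs[i].Name,
			StateName: state.Name,
		})
	}
	return steps, nil
}

func main() {
	// One spec but two states: the shape that triggers the panic above.
	specs := []Step{{Name: "build"}}
	states := []StepState{{Name: "build"}, {Name: "extra"}}
	if _, err := buildSteps(specs, states); err != nil {
		fmt.Println("skipping attestation:", err)
	}
}
```

Returning an error lets the reconciler log the offending object and move on, rather than crashing the whole controller pod.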

NOTE: error configuring x509 signer: no valid private key found, looked for: [x509.pem, cosign.key] usually means you haven't configured a cosign key to be used by Chains. Check out our getting started tutorial for more info on how to do this. I'm confident that this is unrelated to the index out of range issue.

avumdonny commented 3 months ago

Hello lcarva,

Thank you for your response.

The version of Chains is v0.20.1. The logs for the Chains controller kept stopping around the same place in the same namespace. I thought that deleting the completed/failed pipelineruns might resolve the issue, but they have been pending deletion since Friday because each one has a finalizer set to chains.tekton.dev. Would it cause any issues to remove these finalizers?

lcarva commented 3 months ago

I thought that deleting the completed/failed pipelineruns might resolve the issue, but they have been pending deletion since Friday because each one has a finalizer set to chains.tekton.dev. Would it cause any issues to remove these finalizers?

Removing the Chains finalizer just means Chains may not have a chance to process those particular PipelineRuns before they are deleted. I think in this situation, you want to bypass that safeguard.
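For reference, here is a rough sketch of how the patch for dropping a single finalizer could be built. It assumes the finalizer string is exactly what shows up under metadata.finalizers on your PipelineRuns (reported above as chains.tekton.dev; check the actual value on your objects), and `finalizerPatch` is a hypothetical helper, not something Chains provides.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// finalizerPatch returns a JSON merge patch that sets metadata.finalizers
// to the current list minus the one to remove. A JSON merge patch replaces
// arrays wholesale, so any finalizers that should stay must be listed.
func finalizerPatch(current []string, remove string) ([]byte, error) {
	kept := make([]string, 0, len(current))
	for _, f := range current {
		if f != remove {
			kept = append(kept, f)
		}
	}
	patch := map[string]any{
		"metadata": map[string]any{"finalizers": kept},
	}
	return json.Marshal(patch)
}

func main() {
	// "example.com/other" is a made-up second finalizer for illustration.
	current := []string{"chains.tekton.dev", "example.com/other"}
	patch, err := finalizerPatch(current, "chains.tekton.dev")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(patch))
}
```

The resulting JSON can be passed to `kubectl patch pipelinerun <name> --type=merge -p '<patch>'` for each stuck PipelineRun.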

avumdonny commented 3 months ago

Removing the finalizers from the pipelineruns allowed the API to finish deleting them and the related taskruns. I deleted the current chains pod and a new one was created. I haven't seen any errors yet and the alert cleared as well. I believe this issue is resolved. Thank you for your help.