tektoncd / pipeline

A cloud-native Pipeline resource.
https://tekton.dev
Apache License 2.0

Publish task fails, IMAGES result too large #4282

Open afrittoli opened 3 years ago

afrittoli commented 3 years ago

Expected Behavior

It is possible to release Tekton Pipelines

Actual Behavior

The publish task fails because the IMAGES result is too large:

{"level":"fatal","ts":1633523339.3185616,"caller":"entrypoint/entrypointer.go:203","msg":"Error while handling results: Termination message is above max allowed size 4096, caused by large task result.","stacktrace":"github.com/tektoncd/pipeline/pkg/entrypoint.Entrypointer.Go\n\tgithub.com/tektoncd/pipeline/pkg/entrypoint/entrypointer.go:203\nmain.main\n\tgithub.com/tektoncd/pipeline/cmd/entrypoint/main.go:126\nruntime.main\n\truntime/proc.go:225"}

Steps to Reproduce the Problem

  1. Trigger a nightly release

Additional Info

The IMAGES result is used by Tekton Chains to sign the container images. The result includes all the container images produced by ko plus all their copies to the various regional registries.

Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.13-gke.1900", GitCommit:"ee714a7b695ca42b9bd0c8fe2c0159024cdcba5e", GitTreeState:"clean", BuildDate:"2021-08-11T09:19:42Z", GoVersion:"go1.15.13b5", Compiler:"gc", Platform:"linux/amd64"}
Client version: 0.19.0
Pipeline version: v0.27.3
Triggers version: v0.16.0
Dashboard version: v0.19.0
afrittoli commented 3 years ago

/cc @barthy1 @vdemeester @bobcatfish @priyawadhwa

vdemeester commented 3 years ago

Ah 😅 This brings some light and urgency to the TEP around this problem then 🙃

afrittoli commented 3 years ago

Heh, indeed... but we'll need a solution before the TEP though. Using multiple results would not help, we would need to use multiple tasks 😅

afrittoli commented 3 years ago

The result looks like this:

gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/controller@sha256:8e749dc794d6c26b54842599eaa61b6ecbc1161d4c8207f6227089a74272d838,
us.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/controller@sha256:8e749dc794d6c26b54842599eaa61b6ecbc1161d4c8207f6227089a74272d838,
eu.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/controller@sha256:8e749dc794d6c26b54842599eaa61b6ecbc1161d4c8207f6227089a74272d838,
asia.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/controller@sha256:8e749dc794d6c26b54842599eaa61b6ecbc1161d4c8207f6227089a74272d838,
gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/kubeconfigwriter@sha256:fa6706ae3562ddaa3cf1efbfe3bf56cb1a07bcf9bdfbb191dc79b0b7cf3bd889,
us.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/kubeconfigwriter@sha256:fa6706ae3562ddaa3cf1efbfe3bf56cb1a07bcf9bdfbb191dc79b0b7cf3bd889,
eu.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/kubeconfigwriter@sha256:fa6706ae3562ddaa3cf1efbfe3bf56cb1a07bcf9bdfbb191dc79b0b7cf3bd889,
asia.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/kubeconfigwriter@sha256:fa6706ae3562ddaa3cf1efbfe3bf56cb1a07bcf9bdfbb191dc79b0b7cf3bd889,
gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/git-init@sha256:64cfa7edd4243ecac8287b475ddd7745b44b0b2be2a21065aea5b202762d0bad,
us.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/git-init@sha256:64cfa7edd4243ecac8287b475ddd7745b44b0b2be2a21065aea5b202762d0bad,
eu.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/git-init@sha256:64cfa7edd4243ecac8287b475ddd7745b44b0b2be2a21065aea5b202762d0bad,
asia.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/git-init@sha256:64cfa7edd4243ecac8287b475ddd7745b44b0b2be2a21065aea5b202762d0bad,
gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/entrypoint@sha256:ae20b7863effaa2cc620acc9cf6ff1f80681aab7e84419a388f3579a6392cb2c,
us.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/entrypoint@sha256:ae20b7863effaa2cc620acc9cf6ff1f80681aab7e84419a388f3579a6392cb2c,
eu.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/entrypoint@sha256:ae20b7863effaa2cc620acc9cf6ff1f80681aab7e84419a388f3579a6392cb2c,
asia.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/entrypoint@sha256:ae20b7863effaa2cc620acc9cf6ff1f80681aab7e84419a388f3579a6392cb2c,
gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/nop@sha256:22308e68d9d550ea3d5af81f289529e6ab2b2d0f4e34b419aa3b4c867c8d7cbc,
us.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/nop@sha256:22308e68d9d550ea3d5af81f289529e6ab2b2d0f4e34b419aa3b4c867c8d7cbc,
eu.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/nop@sha256:22308e68d9d550ea3d5af81f289529e6ab2b2d0f4e34b419aa3b4c867c8d7cbc,
asia.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/nop@sha256:22308e68d9d550ea3d5af81f289529e6ab2b2d0f4e34b419aa3b4c867c8d7cbc,
gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/imagedigestexporter@sha256:5f2ddfddf0930cd1907bec0006a613dbfce2d69184d7ed552acbec1d769e50dc,
us.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/imagedigestexporter@sha256:5f2ddfddf0930cd1907bec0006a613dbfce2d69184d7ed552acbec1d769e50dc,
eu.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/imagedigestexporter@sha256:5f2ddfddf0930cd1907bec0006a613dbfce2d69184d7ed552acbec1d769e50dc,
asia.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/imagedigestexporter@sha256:5f2ddfddf0930cd1907bec0006a613dbfce2d69184d7ed552acbec1d769e50dc,
gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/pullrequest-init@sha256:c43f269ea4e66e85bb9611c89e7d2fe681b520286243a77e75479d338d0a84bc,
us.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/pullrequest-init@sha256:c43f269ea4e66e85bb9611c89e7d2fe681b520286243a77e75479d338d0a84bc,
eu.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/pullrequest-init@sha256:c43f269ea4e66e85bb9611c89e7d2fe681b520286243a77e75479d338d0a84bc,
asia.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/pullrequest-init@sha256:c43f269ea4e66e85bb9611c89e7d2fe681b520286243a77e75479d338d0a84bc,
gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/webhook@sha256:6b9b7afe486afb7f71e84958a53013603b32dff3cc90c140d3b5c0606fe291c2,
us.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/webhook@sha256:6b9b7afe486afb7f71e84958a53013603b32dff3cc90c140d3b5c0606fe291c2,
eu.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/webhook@sha256:6b9b7afe486afb7f71e84958a53013603b32dff3cc90c140d3b5c0606fe291c2,
asia.gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/webhook@sha256:6b9b7afe486afb7f71e84958a53013603b32dff3cc90c140d3b5c0606fe291c2,

That is 4572 characters, which exceeds the 4096 limit. I think the only alternative for now is to sign only the images on gcr.io; we can start signing the regional copies once we solve the result size issue.
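The size can be checked with a short script. The sketch below is only an illustration, not the release task's code: it rebuilds the list from the eight image names and four registry hosts shown above, uses a placeholder 64-character digest instead of the real ones, and assumes each entry is followed by a comma and a newline. The 4096 limit is the one reported in the error message.

```python
# Rough size check for the IMAGES result (illustrative sketch).
MAX_TERMINATION_MESSAGE = 4096  # limit quoted in the entrypoint error

REGISTRIES = ["gcr.io", "us.gcr.io", "eu.gcr.io", "asia.gcr.io"]
REPO = "tekton-nightly/github.com/tektoncd/pipeline/cmd"
IMAGES = [
    "controller", "kubeconfigwriter", "git-init", "entrypoint",
    "nop", "imagedigestexporter", "pullrequest-init", "webhook",
]
# Assumption: a placeholder with the same length as a real sha256 digest.
PLACEHOLDER_DIGEST = "sha256:" + "0" * 64


def images_result(registries):
    """Build the result text: one image reference per line, each followed by a comma."""
    return "".join(
        f"{registry}/{REPO}/{image}@{PLACEHOLDER_DIGEST},\n"
        for image in IMAGES
        for registry in registries
    )


all_regions = images_result(REGISTRIES)
gcr_only = images_result(["gcr.io"])

print(len(all_regions), len(all_regions) <= MAX_TERMINATION_MESSAGE)
print(len(gcr_only), len(gcr_only) <= MAX_TERMINATION_MESSAGE)
```

Under these assumptions the full list comes to 4572 characters, while the gcr.io-only list is 1121 characters. The real termination message also carries some encoding overhead, so the actual headroom is somewhat smaller.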

vdemeester commented 3 years ago

Heh, indeed... but we'll need a solution before the TEP though. Using multiple results would not help, we would need to use multiple tasks 😅

It wouldn't because of the termination message limit thingy right?

afrittoli commented 3 years ago

Heh, indeed... but we'll need a solution before the TEP though. Using multiple results would not help, we would need to use multiple tasks 😅

It wouldn't because of the termination message limit thingy right?

Yes, indeed. We store results in the POD termination message, so having multiple results or multiple steps does not help.
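To illustrate why splitting does not help, here is a rough model that takes the statement above at face value. It assumes all results are serialized together as one JSON list of key/value entries that must fit in the 4096-byte termination message; the exact encoding used by the entrypoint may differ, and `fits_in_termination_message` is a hypothetical helper, not a Tekton API.

```python
import json

MAX_TERMINATION_MESSAGE = 4096  # limit quoted in the entrypoint error


def termination_message(results):
    """Serialize every result into one message (assumed JSON encoding)."""
    entries = [{"key": name, "value": value} for name, value in results.items()]
    return json.dumps(entries, separators=(",", ":"))


def fits_in_termination_message(results):
    """Hypothetical helper: does the combined message stay within the limit?"""
    return len(termination_message(results).encode("utf-8")) <= MAX_TERMINATION_MESSAGE


payload = "x" * 4572  # stand-in for the IMAGES value

single = {"IMAGES": payload}
split = {"IMAGES_1": payload[:2286], "IMAGES_2": payload[2286:]}

print(fits_in_termination_message(single))  # False
print(fits_in_termination_message(split))   # False: both halves share the same budget
```

In this model, splitting the value makes the message slightly longer, because each extra result adds its own key and framing.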

pritidesai commented 3 years ago

Heh, indeed... but we'll need a solution before the TEP though. Using multiple results would not help, we would need to use multiple tasks 😅

It wouldn't because of the termination message limit thingy right?

Yes, indeed. We store results in the POD termination message, so having multiple results or multiple steps does not help.

So indirectly Chains has the limitation and does not support a taskRun producing so many images 😞

priyawadhwa commented 3 years ago

Heh, indeed... but we'll need a solution before the TEP though. Using multiple results would not help, we would need to use multiple tasks 😅

We ended up having to use multiple tasks for distroless, but it's pretty hacky 😕 Signing only gcr.io seems like a good short term solution. We also have a branch in Chains with a prototype Chains API, which could be reworked a little to also accept large results 🤔

afrittoli commented 3 years ago

Since the workaround was merged, I have downgraded the priority of this issue.

afrittoli commented 3 years ago

Successful nightly run: https://dashboard.dogfooding.tekton.dev/#/namespaces/tekton-nightly/pipelineruns/pipeline-release-nightly-gp4wm?pipelineTask=git-clone&step=clone

pritidesai commented 3 years ago

@priyawadhwa any thoughts on using something other than a single task result IMAGES? Anything I can think of sounds hacky, for example, a dedicated section in a taskRun to maintain the list of images other than the task result?

EDIT: something like taskRun.status.images in addition to IMAGES task result

afrittoli commented 3 years ago

@priyawadhwa any thoughts on using something other than a single task result IMAGES? Anything I can think of sounds hacky, for example, a dedicated section in a taskRun to maintain the list of images other than the task result?

EDIT: something like taskRun.status.images in addition to IMAGES task result

@pritidesai if we go down that route we might consider adding an artifact section instead, as it's not only container images that we might work with - which feels a bit like going back to PipelineResources :]

priyawadhwa commented 3 years ago

@pritidesai if we go down that route we might consider adding an artifact section instead, as it's not only container images that we might work with - which feels a bit like going back to PipelineResources :]

From what I remember you also had to specify your PipelineResources upfront (but I might have that wrong!). If that's the case, it can get pretty inconvenient if you're building more than 3 images in a task. The nice thing about the IMAGES result is that it's dynamic in that way.

pritidesai commented 3 years ago

The image resource might not work with such dynamism. The outputs.resources section has to list, in advance, every image a task is going to produce.

  outputs:
    resources:
    - name: builtImage
      type: image

I was thinking of a solution which is a little more structured than a task result.

tekton-robot commented 2 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale with a justification. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot commented 2 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten with a justification. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

pritidesai commented 2 years ago

This still needs resolution; it could be addressed by https://github.com/tektoncd/pipeline/issues/4012 and TEP-0086.

/lifecycle frozen

afrittoli commented 1 year ago

We discussed this in the Tekton Data Interface working group. @wlynch commented:

We probably don't want to be doing individual signing events for each image in each region.

These are going to be different signing events so you'll end up with different signature checksums because of embedded timestamps which throws people off sometimes. We likely want to promote each image to each region with their existing signatures via cosign cp. See https://github.com/kubernetes/registry.k8s.io/issues/187 for a similar issue.
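A sketch of what promotion by copy could look like: rather than signing every regional reference, the gcr.io image is signed once and then copied, together with its signature, to the regional registries. The snippet below only builds the command lines and does not run them. It assumes the `cosign copy` subcommand (for which `cp` is the short form mentioned above) takes a source and a destination reference; the helper name and the example image path are illustrative.

```python
REGIONAL_REGISTRIES = ["us.gcr.io", "eu.gcr.io", "asia.gcr.io"]


def promotion_commands(source_ref, registries=REGIONAL_REGISTRIES):
    """Return one `cosign copy` command per regional registry (hypothetical helper).

    The source reference must live on gcr.io; only the registry host is swapped.
    """
    host, _, path = source_ref.partition("/")
    if host != "gcr.io":
        raise ValueError(f"expected a gcr.io reference, got {source_ref!r}")
    return [["cosign", "copy", source_ref, f"{registry}/{path}"] for registry in registries]


source = "gcr.io/tekton-nightly/github.com/tektoncd/pipeline/cmd/controller"
for command in promotion_commands(source):
    print(" ".join(command))
```

With this approach the result only needs to carry the gcr.io references, since the regional copies are derived from them.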

afrittoli commented 1 year ago

Thanks @wlynch - good point, I agree we should not sign regional copies separately.

Since the signing happens out of band (performed by Chains), we cannot really copy the signature to the regional copies unless we trigger another pipeline after the signing happens. This is probably OK since signature files are much smaller than the images.

We could copy the SBOM files around, but that's a separate issue. I would propose we close this one.