tektoncd / pipeline

A cloud-native Pipeline resource.
https://tekton.dev
Apache License 2.0
8.51k stars 1.78k forks source link

Handle non-printable characters/binary data in task results #1939

Closed eddie4941 closed 4 years ago

eddie4941 commented 4 years ago

Abstract

In the current design of Task Results, values that are output by a task are read from the result files (/tekton/results/<resultName>) and stored in the TaskRun.Status as is. This means that result values with non-printable characters/binary data can end up in the taskrun.status.taskResults making it look awkward when fetching it via k8 api.

For example, here's a task that outputs a result with and without a newline. The result with the new line is formatted differently than the one without. The output would be even worse if there were more new lines/other binary data and might pose problems when as pass the data to subsequent tasks.

apiVersion: tekton.dev/v1alpha1
 kind: TaskRun
 ...
 status:
 ...
    taskResults:
    - name: current-date-newline
      value: |
        1579891332
    - name: current-date-no-newline
      value: "1579891331"

Potential solutions

One approach to avoid this would be to encode the result in base64. We could do the following:

  1. Read results from the result files (/tekton/results/<resultName>)
  2. Store the content of the result file as a base64 string in the termination log
  3. Store the base64 encoded value in the taskrun.status.taskResults[].value

This would mean that all taskrun.status.taskResults[].value would be base64 encoded. Therefore we may need to adjust how the controller passes data along to subsequent task to potentially decode the data.

Another approach is to add a type field to the task.spec.results[] to indicate the type of date the result is meant to hold. The type field would have at least two values: string and bytes.

e.g.

apiVersion: tekton.dev/v1alpha1
kind: Task
...
spec:
  results:
    - name: my-string-result
      type: string
    - name: my-bytes-results
      type: bytes

results of type string would be handled as they are today, but results of type bytes would be base64 encoded as described above and passed to subsequent tasks as base64 encoded strings to avoid decoding issues.

eddie4941 commented 4 years ago

cc @sbwsg

eddie4941 commented 4 years ago

Of course, it might also be decided that non-printable/binary data is not meant to be passed via task results. In this case, we should call this out explicitly somewhere.

ghost commented 4 years ago

Couple other considerations here:

  1. at the moment the only types of variables we have in Tekton are strings and arrays of strings. iow $(params.foo) always refers to either a string value or array value. So if we introduce a bytes result and want that to be interpolated as a variable replacement in subsequent tasks we'd need to ensure it gets passed along as base64. But if it's a string value we'd probably want to interpolate its literal value, not the base64 encoded string.

  2. base64 inflates the size of a message by approximately 4/3. We'd probably want to error out or otherwise message to the user if their result encoded too large to fit in the termination log and got truncated.

vdemeester commented 4 years ago

/kind question /kind feature /area api

tekton-robot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot commented 4 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close

Send feedback to tektoncd/plumbing.

tekton-robot commented 4 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with /close.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

tekton-robot commented 4 years ago

@tekton-robot: Closing this issue.

In response to [this](https://github.com/tektoncd/pipeline/issues/1939#issuecomment-673309991): >Rotten issues close after 30d of inactivity. >Reopen the issue with `/reopen`. >Mark the issue as fresh with `/remove-lifecycle rotten`. > >/close > >Send feedback to [tektoncd/plumbing](https://github.com/tektoncd/plumbing). Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.