openshift / tektoncd-pipeline-operator

tektoncd-pipeline operator for Kubernetes to manage installation, update, and uninstallation of tekton-cd pipelines.
Apache License 2.0

pipelineRun serviceAccount not restricting securityContexts on OpenShift #545

Closed · namloc2001 closed this issue 3 years ago

namloc2001 commented 3 years ago

Hi, I originally opened this here: https://github.com/tektoncd/pipeline/issues/3625 but was requested/informed to reopen it here. (@gabemontero @vdemeester @siamaksade @sbose78 FYI)

Expected Behavior

I expect the serviceAccount associated with the pipelineRun to have SCC controls applied to it. If it is a new serviceAccount (or any serviceAccount that hasn't been explicitly granted access to other SCCs elsewhere), I expect those controls to be in line with the restricted SCC.

Actual Behavior

I am deploying a pipelineRun manifest via oc create -f pipeline-run.yaml after having logged into the cluster with my (cluster-admin) personal account. The config is:

apiVersion: tekton.dev/v1alpha1
kind: PipelineRun
metadata:
  generateName: demo-sysdig-inline-permission-
spec:
  pipelineRef:
    name: sysdig-pipeline
  podTemplate:
    securityContext:
      runAsUser: 1005
  serviceAccountName: 'new-pipelinerunner'
  resources:
  - name: container-image
    resourceRef:
      name: container-image
  taskRunSpecs:
    - pipelineTaskName: sysdig-scan-inline-new
      taskServiceAccountName: 'new-pipelinerunner'
      taskPodTemplate:  
        securityContext:
          runAsUser: 1001
          fsGroup: 1001

The resultant pod is deployed with UID=1005. However, I have not granted the new-pipelinerunner serviceAccount access to any additional SCCs, so by default it should only be able to function in line with the restricted SCC settings, one of which is that runAsUser must come from the project UID range (openshift.io/sa.scc.uid-range: 1005450000/10000).

And yet if I run the id command on my pipeline pod, I get:

oc exec -it demo-sysdig-inline-permission-gzwn9-sysdig-scan-8r5hx-pod-52rbz -- sh -c "id"
uid=1005(1005) gid=0(root) groups=0(root)
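
For reference, the UID block that the restricted SCC draws from is recorded as an annotation on the project. A quick way to check it (an illustrative command; the namespace name is the one used in this issue):

```sh
# UID range the restricted SCC would assign for pods in this project
oc get namespace tekton-pipelines \
  -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.uid-range}'
# e.g. 1005450000/10000 -> a restricted pod's runAsUser should fall in this block
```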

Steps to Reproduce the Problem

  1. Create a serviceAccount on OpenShift in the tekton-pipelines namespace:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: new-pipelinerunner
      namespace: tekton-pipelines
  2. Run any pipelineRun with this serviceAccount attached (it makes no difference whether you specify podTemplate.securityContext on the pipelineRun or not).

  3. With the pod running, confirm the UID of the user; it should be in the project range (oc exec -it <pod-name> [-c container_name] -- sh -c "id"). A further check is sketched below.
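
As an additional check (illustrative; OpenShift's SCC admission records the chosen SCC as a pod annotation), the SCC that was actually applied to the pod can be read back directly:

```sh
# Which SCC admitted this pod? "restricted" is expected for an unprivileged SA
oc get pod <pod-name> \
  -o jsonpath='{.metadata.annotations.openshift\.io/scc}'
```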

Additional Info

Server <I've redacted> kubernetes v1.16.2+853223d


- Tekton Pipeline version (output of `tkn version` or `kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'`): v0.15.2


This is running on Red Hat OpenShift on IBM Cloud (ROKS), version 4.3.40_1546:

System Info:
- Kernel Version: 3.10.0-1160.6.1.el7.x86_64
- OS Image: Red Hat
- Operating System: linux
- Architecture: amd64
- Container Runtime Version: cri-o://1.16.6-18.rhaos4.3.git538d861.el7
- Kubelet Version: v1.16.2+853223d
- Kube-Proxy Version: v1.16.2+853223d



Update: it would appear that my user is influencing the SCC being used, as the anyuid SCC is being prioritised over the restricted SCC, in line with what is explained here:

    By default, the anyuid SCC granted to cluster administrators is given priority in their SCC set. This allows cluster administrators to run pods as any user without specifying a RunAsUser on the pod's SecurityContext. The administrator may still specify a RunAsUser if they wish.

(https://docs.openshift.com/container-platform/3.6/architecture/additional_concepts/authorization.html#scc-prioritization)
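
The priority values that drive this ordering can be inspected directly (an illustrative command; the column names are just chosen for this check):

```sh
# List SCCs with their priority; anyuid typically carries priority 10,
# so it wins over restricted (no priority) for users who are granted it
oc get scc -o custom-columns=NAME:.metadata.name,PRIORITY:.priority
```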

I've never encountered this before because I have always created Deployments, which create the pods for me (usually via Helm). If I create a Deployment, which in turn creates my pod, the serviceAccount on the Deployment dictates the SCC selected, and so the pod will use the restricted SCC unless I grant that serviceAccount access to a less restrictive SCC (sketched below).
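
As a point of comparison, this is the pattern I am used to: a minimal sketch (names are illustrative) where the pod's SCC is derived from the Deployment's serviceAccount rather than from the human user who created the Deployment:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scc-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: scc-demo
  template:
    metadata:
      labels:
        app: scc-demo
    spec:
      # SCC admission for the pod is evaluated against this serviceAccount,
      # not against the (cluster-admin) user who created the Deployment
      serviceAccountName: new-pipelinerunner
      containers:
      - name: app
        image: registry.access.redhat.com/ubi8/ubi-minimal:latest
        command: ["sleep", "infinity"]
```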

So my question is: can Tekton be configured to use the same relationship that a Deployment serviceAccount has over a pod for SCCs? Can the pipelineRun (into which we declare a serviceAccount) serve as the "deployment" that creates the pod, so that the pipelineRun serviceAccount dictates the SCC selected, rather than this changing based on the human user who instigated it?
namloc2001 commented 3 years ago

@vdemeester happy to move the convo (from https://github.com/tektoncd/pipeline/issues/3625) to here if it's more OpenShift related. Also including @sbose78 given their changes in #503. My question being:

Is there a reason why privileged wasn't downgraded to restricted? That way the pipeline pods would run with minimal SCC permissions, and if additional permissions beyond the restricted SCC are required, we could grant them to the SA we attach to the pipelineRun or taskRun. With the current method, I don't believe I can deploy using the restricted SCC.
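
For illustration, granting extra permissions to the run's own SA would then look something like this (a sketch, assuming the SA and namespace from earlier in this issue; only needed when a task genuinely requires more than restricted allows):

```sh
# Allow the pipeline's serviceAccount to use the anyuid SCC in this namespace
oc adm policy add-scc-to-user anyuid \
  -z new-pipelinerunner -n tekton-pipelines
```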

I can see from #503 and #504 that privileged was changed (originally to anyuid, then further to the nonroot SCC). So, via those changes, the SA tekton-pipelines-controller is now granted nonroot.

What I'm trying to work out is whether this could go further (i.e. restricted SCC), because in 00-release.yaml I can see:

      containers:
      - name: tekton-pipelines-controller
        image: quay.io/openshift-pipeline/tektoncd-pipeline-controller:v0.18.0
        args: [
         ...
         ...
          # This is gcr.io/google.com/cloudsdktool/cloud-sdk:302.0.0-slim
          "-gsutil-image", "gcr.io/google.com/cloudsdktool/cloud-sdk@sha256:27b2c22bf259d9bc1a291e99c63791ba0c27a04d2db0a43241ba0f1f20f4067f",
          # The shell image must be root in order to create directories and copy files to PVCs.
          # gcr.io/distroless/base:debug as of October 16, 2020
          "-shell-image", "registry.access.redhat.com/ubi8/ubi-minimal:latest"
        ...
        ...
        securityContext:
          allowPrivilegeEscalation: false
          # User 65532 is the distroless nonroot user ID

and

      containers:
      - name: webhook
        # This is the Go import path for the binary that is containerized
        # and substituted here.
        image: quay.io/openshift-pipeline/tektoncd-pipeline-webhook:v0.18.0
        ...
        ...
        securityContext:
          allowPrivilegeEscalation: false
          # User 65532 is the distroless nonroot user ID

So, with reference to tekton-pipelines-controller: does the shell-image (registry.access.redhat.com/ubi8/ubi-minimal:latest) get launched as root? And if so, how, given that the SCC now aligned to this SA is nonroot?

Assuming that all can/must now run as nonroot, does that mean this deployment can take place under restricted SCC? Or are other requirements for the deployment stopping this?
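
For context, a pod securityContext that stays within what the restricted SCC allows would look roughly like this (a sketch only; the key point is that runAsUser is omitted so OpenShift assigns a UID from the project range):

```yaml
securityContext:
  allowPrivilegeEscalation: false
  runAsNonRoot: true
  capabilities:
    drop: ["ALL"]
  # no runAsUser here: under the restricted SCC, OpenShift injects a UID
  # from the project's openshift.io/sa.scc.uid-range annotation
```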

aelbarkani commented 3 years ago

This is definitely a security issue which leads to privilege escalation. I don't think it is wise for OpenShift Pipelines to go GA without solving this issue.

namloc2001 commented 3 years ago

@aelbarkani, assuming the change away from the privileged SCC was to nonroot (rather than anyuid), it is not a security issue that leads to privilege escalation (as far as I am aware). The restricted and nonroot SCCs are pretty much identical, the obvious difference being that nonroot doesn't force the UID to be assigned from the project range (while still ensuring the container cannot run as the root user).

My concern is that giving the SA tekton-pipelines-controller nonroot SCC access means that the model of "the SA I use will default to the restricted SCC unless I configure things differently" is broken by Tekton on OpenShift. This means users need to work with two different styles.

It might also have implications: for OpenShift restricted SCC compatibility we prepare our container images with chgrp -R 0 /path/to/dir and chmod g=u /path/to/dir, but we won't actually be provided with GID=0 unless we run under restricted. It shouldn't be a problem as we can set the runAsUser, but it's just "another thing to be aware of".
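
For anyone unfamiliar with that pattern, it is typically applied at image build time, along these lines (an illustrative Dockerfile fragment; the path is a placeholder):

```dockerfile
FROM registry.access.redhat.com/ubi8/ubi-minimal:latest

# Make the app directory group-owned by root (GID 0) and group-writable,
# so an arbitrary UID injected by the restricted SCC can still write to it
RUN mkdir -p /opt/app-root && \
    chgrp -R 0 /opt/app-root && \
    chmod -R g=u /opt/app-root

USER 1001
WORKDIR /opt/app-root
```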

namloc2001 commented 3 years ago

I think #492 might provide a resolution/mechanism to answer this.

openshift-bot commented 3 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

gabemontero commented 3 years ago

/remove-lifecycle stale

openshift-bot commented 3 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 3 years ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten /remove-lifecycle stale

openshift-bot commented 3 years ago

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci[bot] commented 3 years ago

@openshift-bot: Closing this issue.

In response to [this](https://github.com/openshift/tektoncd-pipeline-operator/issues/545#issuecomment-902385025):

> Rotten issues close after 30d of inactivity.
>
> Reopen the issue by commenting `/reopen`.
> Mark the issue as fresh by commenting `/remove-lifecycle rotten`.
> Exclude this issue from closing again by commenting `/lifecycle frozen`.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.