GalleyBytes / terraform-operator

A Kubernetes CRD to handle terraform operations
http://tf.galleybytes.com
Apache License 2.0

securityContext set on worker pod making it difficult to use on OpenShift #126

Open callum-stakater opened 1 year ago

callum-stakater commented 1 year ago

Hi There!

Recently you merged a PR I opened to make the securityContext configurable for the controller, which greatly simplifies deploying the operator to OpenShift clusters. That worked well, but the issue I have now is that the same securityContext is set by the controller on the workflow pod that it spawns to run the tf workflow.

Is there any way you can make that configurable, or maybe inherit the securityContext from the controller?

The reason this is an issue on OpenShift is that OpenShift has the concept of SecurityContextConstraints (SCCs), and defining an arbitrary runAsUser is a privileged action. It is possible to get around this by creating a dedicated SCC for the service account the terraform-operator uses for the workflow pod and granting it "Any Run as user", but the better practice is to not set that value at all, which allows OpenShift to assign its own UID from the range allocated to the pod's namespace (project).

2022-11-15T14:42:30.080Z    DEBUG   controller-runtime.manager.events   Warning {"object": {"kind":"Terraform","namespace":"stakater-terraform","name":"simple-template-example","uid":"6561f384-b5ec-448e-a01a-10aeda23b2e9","apiVersion":"tf.isaaguilar.com/v1alpha2","resourceVersion":"1388221816"}, "reason": "PodCreateError", "message": "Could not create Pod pods \"simple-template-example-9oz3h3r3-v3-setup-\" is forbidden: unable to validate against any security context constraint: [provider \"anyuid\": Forbidden: not usable by user or serviceaccount, provider \"pipelines-scc\": Forbidden: not usable by user or serviceaccount, provider \"tekton-pipelines-scc\": Forbidden: not usable by user or serviceaccount, provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{2000}: 2000 is not an allowed group, spec.containers[0].securityContext.runAsUser: Invalid value: 2000: must be in the ranges: [1001880000, 1001889999], provider \"nonroot\": Forbidden: not usable by user or serviceaccount, provider \"noobaa\": Forbidden: not usable by user or serviceaccount, provider \"noobaa-endpoint\": Forbidden: not usable by user or serviceaccount, provider \"scc-kubecost\": Forbidden: not usable by user or serviceaccount, provider \"sonardb-scc\": Forbidden: not usable by user or serviceaccount, provider \"hostmount-anyuid\": Forbidden: not usable by user or serviceaccount, provider \"log-collector-scc\": Forbidden: not usable by user or serviceaccount, provider \"apps-fluentd-scc\": Forbidden: not usable by user or serviceaccount, provider \"iam-scc\": Forbidden: not usable by user or serviceaccount, provider \"stakater-managed-openshift-apps-keycloak-scc\": Forbidden: not usable by user or serviceaccount, provider \"machine-api-termination-handler\": Forbidden: not usable by user or serviceaccount, provider \"hostnetwork\": Forbidden: not usable by user or serviceaccount, provider \"hostaccess\": Forbidden: not usable by user or serviceaccount, provider \"rook-ceph\": Forbidden: not usable by user or serviceaccount, provider \"node-exporter\": Forbidden: not usable by user or serviceaccount, provider \"privileged\": Forbidden: not usable by user or serviceaccount, provider \"rook-ceph-csi\": Forbidden: not usable by user or serviceaccount]"}
callum-stakater commented 1 year ago

Actually, looking a bit closer at the error, it also flags .spec.securityContext.fsGroup: 2000, but the same concept described above applies here.

isaaguilar commented 1 year ago

Hi @callum-stakater, I'm sorry you're having this issue. I'm not familiar with OpenShift or the concept of SecurityContextConstraints. The security context with user/group 2000/2000 is actually just there to solve an issue with multiple pods that share a PersistentVolume.

The id 2000 is a completely arbitrary number selection. However, the workflow has to guarantee that all the task pods get the same exact user/group. Those ids are used to mount the volume to make it readable and writeable. And the mounted volume contains all the data for a workflow to execute.
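
For readers less familiar with the fields involved, the settings described above correspond roughly to the pod spec fragment below. This is only a sketch reconstructed from the field paths in the error message (`.spec.securityContext.fsGroup` and `spec.containers[0].securityContext.runAsUser`), not a copy of what the operator generates; the pod, container, image, mount path and claim names are all hypothetical.

```yaml
# Sketch only: reconstructed from the error message, not from the operator's source.
apiVersion: v1
kind: Pod
metadata:
  name: example-task-pod                # hypothetical name
spec:
  securityContext:
    fsGroup: 2000                       # group ownership applied to mounted volumes
  containers:
  - name: task                          # hypothetical container name
    image: example.com/tf-task:latest   # hypothetical image
    securityContext:
      runAsUser: 2000                   # same fixed UID for every task pod
    volumeMounts:
    - name: workflow-data
      mountPath: /workflow              # hypothetical mount path
  volumes:
  - name: workflow-data
    persistentVolumeClaim:
      claimName: example-workflow-pvc   # hypothetical claim shared by the task pods
```

Because every task pod uses the same UID and fsGroup, files written to the shared volume by one task stay readable and writable for the next.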

Is it just fsGroup 2000 that can't be used? What if the id was definable in the tf resource spec which gets applied to all task pods?

callum-stakater commented 1 year ago

Not a problem, it's the constant battle in the world of OpenShift administration :)

The number 2000 isn't important here. The issue is that OpenShift is "secure by default": it doesn't allow setting runAsUser or fsGroup without additional SCC configuration, because if you can set any number you can set root (0), and if you can set root and your container gets compromised, it's a bad day. So the default is for OpenShift to assign IDs from a high range allocated to each namespace (project), and it forces the administrator to jump through hoops to set their own IDs, on the assumption that they then know what they are doing and the risks involved. (Basically to stop us running random stuff off the internet that runs as root :) )

In this case, though, the fsGroup/runAsUser is defined in the operator code for a reason (shared volumes), and it's perfectly fine and good practice for vanilla Kubernetes users, so we will extend the deployment/chart and add the SCCs we need.

Though it does make the last PR we cut on the charts repo a bit redundant and misleading for future OpenShifters who end up here.
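
For anyone landing here later, a dedicated SCC along the lines discussed above might look something like the following. Treat it as an untested sketch: the SCC name and the service account in `users` are hypothetical (check which service account the workflow pods actually run as), and the remaining settings are modelled loosely on the built-in restricted SCC, with only the UID and fsGroup pinned to 2000.

```yaml
# Untested sketch of a dedicated SCC; names marked hypothetical are assumptions.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: terraform-operator-workflow     # hypothetical name
priority: null
allowPrivilegedContainer: false
allowPrivilegeEscalation: false
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
readOnlyRootFilesystem: false
allowedCapabilities: null
defaultAddCapabilities: null
requiredDropCapabilities:
- ALL
runAsUser:
  type: MustRunAs
  uid: 2000                             # only this UID is allowed, so root (0) is not
fsGroup:
  type: MustRunAs
  ranges:
  - min: 2000
    max: 2000                           # only group 2000 is allowed as fsGroup
supplementalGroups:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
volumes:
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- projected
- secret
users:
# hypothetical namespace and service account name
- system:serviceaccount:stakater-terraform:example-workflow-sa
```

Pinning the UID and group to exactly 2000 is narrower than granting "Any Run as user", so the workflow pods still cannot ask to run as root.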

dan1el-k commented 1 year ago

We are also running the operator on OKD/OpenShift. We usually handle this by adding the "anyuid" SCC to the existing ClusterRole (via a kustomize patch, without needing to change the upstream manifests).

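# kustomization.yaml excerpt: JSON 6902 patch that appends a rule to the ClusterRole.
# The target has no name, so it matches every ClusterRole in the build.
patches:
- target:
    group: rbac.authorization.k8s.io
    version: v1
    kind: ClusterRole
  patch: |-
    - op: add
      path: /rules/-
      value:
        apiGroups:
        - security.openshift.io
        resources:
        - securitycontextconstraints
        verbs:
        - use
        resourceNames:
        - anyuid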
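
If patching the upstream ClusterRole is not an option, the same `use` permission can probably be granted with a separate namespaced Role and RoleBinding. This is a sketch under assumptions rather than something taken from the operator's manifests: the Role name and the service account name are hypothetical, and the namespace should be whichever one the workflow pods run in.

```yaml
# Sketch: standalone RBAC granting "use" on the anyuid SCC to one service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: use-anyuid-scc                  # hypothetical name
  namespace: terraform-operator         # namespace where the workflow pods run
rules:
- apiGroups:
  - security.openshift.io
  resources:
  - securitycontextconstraints
  resourceNames:
  - anyuid
  verbs:
  - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: use-anyuid-scc
  namespace: terraform-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: use-anyuid-scc
subjects:
- kind: ServiceAccount
  name: example-workflow-sa             # hypothetical service account name
  namespace: terraform-operator
```

Scoping the grant to a single service account in a single namespace keeps anyuid from being available to everything bound to the shared ClusterRole.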

Alternatively, add the required user/group range directly to the namespace. This avoids having to grant any SCC or create a new one, as it fixes the range of users/groups to be used in that namespace. Especially if there is only one namespace where you run the operator and the workflows themselves, this is probably the easiest solution, and it also doesn't contradict the SCC concept.

kind: Namespace
apiVersion: v1
metadata:
  name: terraform-operator
  annotations:
    openshift.io/sa.scc.supplemental-groups: 2000/1
    openshift.io/sa.scc.uid-range: 2000/1

Perhaps this helps.

callum-stakater commented 1 year ago

Ah nice, wasn’t aware of the annotation method, that does help
