rabbitmq / cluster-operator

RabbitMQ Cluster Kubernetes Operator
https://www.rabbitmq.com/kubernetes/operator/operator-overview.html
Mozilla Public License 2.0

Deploy fails on OpenShift 4.13.43 #1692

Closed nick990 closed 1 month ago

nick990 commented 1 month ago

I'm using ROSA with OpenShift version 4.13.43. I've installed the Operator via the Operator Hub.

To Reproduce

  1. Create a new RabbitmqCluster.
  2. The StatefulSet fails to start.

The following is the YAML of the cluster:

kind: RabbitmqCluster
apiVersion: rabbitmq.com/v1beta1
metadata:
  name: hello-world
  namespace: dev
spec:
  override:
    statefulset:
      spec:
        template:
          spec:
            containers: []
            securityContext: {}

The following is the error from the StatefulSet:

create Pod hello-world-server-0 in StatefulSet hello-world-server failed error: pods "hello-world-server-0" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted-v2: .spec.securityContext.fsGroup: Invalid value: []int64{0}: 0 is not an allowed group, provider restricted-v2: .initContainers[0].runAsUser: Invalid value: 999: must be in the ranges: [1001020000, 1001029999], provider restricted-v2: .containers[0].runAsUser: Invalid value: 999: must be in the ranges: [1001020000, 1001029999], provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "pcap-dedicated-admins": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "elasticsearch-scc": Forbidden: not usable by user or serviceaccount, provider "log-collector-scc": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "splunkforwarder": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
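
Most entries in that list only say the SCC is not usable by the service account; the rejections that actually evaluated the pod come from restricted-v2. A minimal Python sketch for pulling those out, assuming the message keeps the format shown above (the function name scc_violations is hypothetical, and the message below is a shortened copy of the error):

```python
import re

# Shortened copy of the error above: only two of the "not usable" providers
# are kept for brevity (an assumption made for this sketch).
message = (
    'unable to validate against any security context constraint: ['
    'provider "anyuid": Forbidden: not usable by user or serviceaccount, '
    'provider restricted-v2: .spec.securityContext.fsGroup: Invalid value: '
    '[]int64{0}: 0 is not an allowed group, '
    'provider restricted-v2: .initContainers[0].runAsUser: Invalid value: 999: '
    'must be in the ranges: [1001020000, 1001029999], '
    'provider restricted-v2: .containers[0].runAsUser: Invalid value: 999: '
    'must be in the ranges: [1001020000, 1001029999], '
    'provider "privileged": Forbidden: not usable by user or serviceaccount]'
)

NOT_USABLE = "not usable by user or serviceaccount"


def scc_violations(msg):
    """Return (provider, reason) pairs for SCCs that evaluated and rejected the pod."""
    # The provider list sits between the first "[" and the last "]".
    body = msg[msg.index("[") + 1:msg.rindex("]")]
    # Split only on commas that are followed by the next "provider" entry,
    # so commas inside the UID ranges are left alone.
    entries = re.split(r",\s*(?=provider\s)", body)
    result = []
    for entry in entries:
        match = re.match(r'provider\s+"?([^":\s]+)"?:\s*(.*)', entry)
        if match and not match.group(2).endswith(NOT_USABLE):
            result.append((match.group(1), match.group(2)))
    return result


violations = scc_violations(message)
for provider, reason in violations:
    print(provider, "->", reason)
```

What remains are the fsGroup value 0 and the runAsUser value 999 on the init container and the main container, which is why the suggestions below focus on the pod and container security contexts.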

Version and environment information

olivier-duchaine commented 1 month ago

Hey, you can try setting the init container's security context as well, to override the one the operator sets on it.

kind: RabbitmqCluster
apiVersion: rabbitmq.com/v1beta1
metadata:
  name: hello-world
  namespace: dev
spec:
  override:
    statefulset:
      spec:
        template:
          spec:
            containers: []
            securityContext: {}
            initContainers:
            - name: setup-container
              securityContext: {}

nick990 commented 1 month ago

@olivier-duchaine Hi, thanks for the suggestion, but unfortunately it doesn't seem to work. I get the same error.

DanielePalaia commented 1 month ago

@nick990 I'm not sure this is the cause of the issue, but our operators are tested against the standard "restricted" SCC, whereas you are using "restricted-v2", which looks even more restrictive and could cause permission issues like the one you posted: https://docs.openshift.com/container-platform/4.11/authentication/managing-security-context-constraints.html

nick990 commented 1 month ago

@DanielePalaia Is there a way to create a RabbitmqCluster forcing to use the "restricted" SCC?

DanielePalaia commented 1 month ago

@nick990 Actually, my mistake. I'm running some tests now in our OKD cluster (4.14.0-0.okd-2023-12-01-225814) and it is indeed running with restricted-v2 as well (it seems restricted-v2 has been the default SCC since OpenShift 4.11). But the cluster is running without issues.

Could this SCC have been modified somehow?

When I run kubectl get scc restricted-v2, this is what I get:

dpalaia@C02F541YMD6R infrastructure % kubectl get scc
NAME                              PRIV    CAPS                   SELINUX     RUNASUSER          FSGROUP     SUPGROUP    PRIORITY     READONLYROOTFS   VOLUMES
restricted-v2                     false   ["NET_BIND_SERVICE"]   MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <no value>   false            ["configMap","csi","downwardAPI","emptyDir","ephemeral","persistentVolumeClaim","projected","secret"]
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: false
allowPrivilegedContainer: false
allowedCapabilities:
- NET_BIND_SERVICE
apiVersion: security.openshift.io/v1
defaultAddCapabilities: null
fsGroup:
  type: MustRunAs
groups: []
priority: null
readOnlyRootFilesystem: false
requiredDropCapabilities:
- ALL
runAsUser:
  type: MustRunAsRange
seLinuxContext:
  type: MustRunAs
seccompProfiles:
- runtime/default
supplementalGroups:
  type: RunAsAny
users: []
volumes:
- configMap
- csi
- downwardAPI
- emptyDir
- ephemeral
- persistentVolumeClaim
- projected
- secret

You can also try running with a more relaxed SCC on that cluster, for example by following this guideline: https://access.redhat.com/articles/6973044

nick990 commented 1 month ago

@DanielePalaia I confirm that restricted-v2 has been the default since OpenShift 4.11.

I don't know whether the SCC got modified. How can I verify that? kubectl get scc restricted-v2 gave me the same result as yours:

NAME            PRIV    CAPS                   SELINUX     RUNASUSER        FSGROUP     SUPGROUP   PRIORITY     READONLYROOTFS   VOLUMES
restricted-v2   false   ["NET_BIND_SERVICE"]   MustRunAs   MustRunAsRange   MustRunAs   RunAsAny   <no value>   false            ["configMap","csi","downwardAPI","emptyDir","ephemeral","persistentVolumeClaim","projected","secret"]

DanielDorado commented 1 month ago

@nick990, you wrote statefulset in the YAML, but the field is statefulSet (capital S).
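
A miscased key is not the field the override expects, which is consistent with the error still reporting runAsUser 999 after the override was added. A minimal Python sketch of a check for this kind of typo; the helper miscased_keys and the EXPECTED_OVERRIDE_KEYS set are hypothetical and not part of the operator:

```python
# Hypothetical helper, not part of the operator: reports keys that match an
# expected field name only when letter case is ignored.
EXPECTED_OVERRIDE_KEYS = {"statefulSet"}  # the field name given in this thread


def miscased_keys(mapping, expected):
    """Map each miscased key in `mapping` to the expected spelling."""
    lowered = {name.lower(): name for name in expected}
    return {
        key: lowered[key.lower()]
        for key in mapping
        if key not in expected and key.lower() in lowered
    }


# The override block from the original report, reduced to its top-level key.
override = {"statefulset": {"spec": {}}}
print(miscased_keys(override, EXPECTED_OVERRIDE_KEYS))
# prints {'statefulset': 'statefulSet'}
```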

nick990 commented 1 month ago

@DanielDorado Thanks! I confirm that the solution proposed by @olivier-duchaine works once the key is spelled statefulSet. With the security contexts overridden, the cluster starts:

kind: RabbitmqCluster
apiVersion: rabbitmq.com/v1beta1
metadata:
  name: hello-world
  namespace: dev
spec:
  override:
    statefulSet:
      spec:
        template:
          spec:
            containers: []
            securityContext: {}
            initContainers:
            - name: setup-container
              securityContext: {}