open-policy-agent / gatekeeper

🐊 Gatekeeper - Policy Controller for Kubernetes
https://open-policy-agent.github.io/gatekeeper/
Apache License 2.0

Can't catch OpenShift object (it just seems so, actually NOT) #2013

Closed aitchjoe closed 2 years ago

aitchjoe commented 2 years ago

RESOLVED: check the third comment.

What steps did you take and what happened:

I have created a constraint that enforces the use of a Kubernetes Deployment instead of an OpenShift DeploymentConfig, but it doesn't work. I modified the block-deploymentconfig.yaml shown below to remove the kinds filter, and I can see the warning when creating a Deployment / Service / PVC etc., but not when creating an OpenShift DeploymentConfig / Route etc.

D:\>oc get apiservices
NAME                                                 SERVICE                                                      AVAILABLE   AGE
v1.                                                  Local                                                        True        555d
v1.admissionregistration.k8s.io                      Local                                                        True        555d
v1.apps                                              Local                                                        True        555d
v1.apps.openshift.io                                 openshift-apiserver/api                                      True        555d
v1.batch                                             Local                                                        True        555d
v1.build.openshift.io                                openshift-apiserver/api                                      True        555d
v1.image.openshift.io                                openshift-apiserver/api                                      True        555d
v1.network.openshift.io                              Local                                                        True        6h35m
v1.networking.k8s.io                                 Local                                                        True        555d
v1.route.openshift.io                                openshift-apiserver/api                                      True        555d
......

The difference is whether the API group's SERVICE is Local or not (I checked some, not all). So I think this may not be Gatekeeper's problem, but I haven't found anything on the Kubernetes or OpenShift side.
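To make the Local vs. non-Local split easier to see, here is a minimal sketch that groups the rows of the output above by their SERVICE column. The function name `split_apiservices` and the `SAMPLE` constant are made up for illustration; the rows are copied from the listing above, and the parsing assumes the default whitespace-separated table layout.

```python
# Sample rows copied from the `oc get apiservices` output above.
SAMPLE = """\
NAME                                                 SERVICE                                                      AVAILABLE   AGE
v1.                                                  Local                                                        True        555d
v1.apps                                              Local                                                        True        555d
v1.apps.openshift.io                                 openshift-apiserver/api                                      True        555d
v1.route.openshift.io                                openshift-apiserver/api                                      True        555d
v1.network.openshift.io                              Local                                                        True        6h35m
"""


def split_apiservices(text):
    """Return (local, aggregated) for a whitespace-separated apiservices table.

    `local` lists entries whose SERVICE column is "Local"; `aggregated` maps
    every other entry to the namespace/service that backs it.
    """
    local, aggregated = [], {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:  # ignore blank or truncated rows such as "......"
            continue
        name, service = fields[0], fields[1]
        if service == "Local":
            local.append(name)
        else:
            aggregated[name] = service
    return local, aggregated


local, aggregated = split_apiservices(SAMPLE)
print("Local:", local)
print("Aggregated:", aggregated)
```

With these rows, the OpenShift groups that did not show the warning (apps.openshift.io, route.openshift.io) all land in the aggregated set backed by openshift-apiserver/api.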

What did you expect to happen:

User should see the warning.

Anything else you would like to add:

gatekeeper-validating-webhook-configuration.yaml

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  labels:
    app: 'gatekeeper'
    chart: 'gatekeeper'
    gatekeeper.sh/system: "yes"
    heritage: 'Helm'
    release: 'gatekeeper'
  name: gatekeeper-validating-webhook-configuration
webhooks:
- admissionReviewVersions:
  - v1
  - v1beta1
  clientConfig:
    service:
      name: gatekeeper-webhook-service
      namespace: 'infra-gatekeeper'
      path: /v1/admit
  failurePolicy: Ignore
  matchPolicy: Exact
  name: validation.gatekeeper.sh
  namespaceSelector:
    matchExpressions:
    - key: admission.gatekeeper.sh/ignore
      operator: DoesNotExist
  rules:
  - apiGroups:
    - '*'
    apiVersions:
    - '*'
    operations: 
    - CREATE
    - UPDATE
    resources:
    - '*'
    scope: "Namespaced"
  sideEffects: None
  timeoutSeconds: 3

block-overlapped-openshift-object.yaml

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sblockoverlappedopenshiftobject
  annotations:
    description: >-
      Try Kubernetes Deployment.
spec:
  crd:
    spec:
      names:
        kind: K8sBlockOverlappedOpenshiftObject
      validation:
        openAPIV3Schema:
          type: object
          properties:
            recommendation:
              description: Recommendation object.
              type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sblockoverlappedopenshiftobject

        violation[{"msg": msg}] {
          msg := sprintf("Dont use %v, try %v", [input.review.kind.kind, input.parameters.recommendation])
        }

block-deploymentconfig.yaml

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sBlockOverlappedOpenshiftObject
metadata:
  name: block-deploymentconfig
spec:
  enforcementAction: warn
  match:
    kinds:
      - apiGroups: ["apps.openshift.io"]
        kinds: ["DeploymentConfig"]
  parameters:
    recommendation: "Deployment"

test-dc.yaml

apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  name: test-dc
spec:
  selector:
    app: no-this-app
  replicas: 1
  template:
    metadata:
      labels:
        app: no-this-app
    spec:
      containers:
        - name: httpd
          image: >-
            docker.io/httpd:latest
          ports:
            - containerPort: 8080

Environment:

aitchjoe commented 2 years ago

Sorry, I can't remove the bug label.

aitchjoe commented 2 years ago

I have modified the webhook config in gatekeeper-validating-webhook-configuration.yaml:

webhooks:
- admissionReviewVersions:
  - v1
  - v1beta1
  clientConfig:
    service:
      name: gatekeeper-webhook-service
      namespace: 'infra-gatekeeper'
      path: /v1/admit
  failurePolicy: Fail
  matchPolicy: Exact
  name: validation.gatekeeper.sh
  namespaceSelector:
    matchExpressions:
    - key: admission.gatekeeper.sh/ignore
      operator: DoesNotExist
  rules:
  - apiGroups:
    - 'apps.openshift.io'
    apiVersions:
    - 'v1'
    operations: 
    - CREATE
    - UPDATE
    resources:
    - 'deploymentconfigs'
    scope: "Namespaced"
  sideEffects: None
  timeoutSeconds: 10

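I have changed failurePolicy from Ignore to Fail, narrowed the rule from '*' to only apps.openshift.io / v1 / deploymentconfigs, and raised timeoutSeconds from 3 to 10.

Then I created the dc: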

Error from server (InternalError): error when creating "test-dc.yaml": Internal error occurred: failed calling webhook "validation.gatekeeper.sh": Post "https://gatekeeper-webhook-service.infra-gatekeeper.svc:443/v1/admit?timeout=10s": dial tcp 172.50.106.79:443: i/o timeout

So the problem is NOT "Can't catch OpenShift object". The webhook DID catch it, but the call to Gatekeeper failed with an internal error, and failurePolicy: Ignore made it look as if the object had not been caught.
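To spell out why Ignore hides the failure, here is a simplified model of how the API server turns one validating-webhook call into what the user sees. It is only a sketch: `admission_outcome` and the result strings are hypothetical names for illustration, not part of any Kubernetes or Gatekeeper API.

```python
def admission_outcome(webhook_result, failure_policy):
    """Model the user-visible result of one validating-webhook call.

    webhook_result: "allowed", "warned", "denied", or "error"
                    ("error" covers call failures such as the i/o timeout above).
    failure_policy: "Ignore" or "Fail", as set in the webhook configuration.
    Returns (admitted, message_shown_to_user).
    """
    if webhook_result == "error":
        if failure_policy == "Ignore":
            # The call failed, but the request is admitted with no message.
            return True, None
        return False, "failed calling webhook"
    if webhook_result == "denied":
        return False, "denied by policy"
    if webhook_result == "warned":
        return True, "policy warning"
    return True, None


# Webhook unreachable: Ignore looks exactly like "no constraint matched".
print(admission_outcome("error", "Ignore"))
print(admission_outcome("allowed", "Ignore"))
# Same unreachable webhook with Fail: the error finally becomes visible.
print(admission_outcome("error", "Fail"))
# Once the webhook is reachable, enforcementAction: warn shows the warning.
print(admission_outcome("warned", "Ignore"))
```

The first two calls return the same value, which is why the missing warning looked like the constraint not matching at all until failurePolicy was switched to Fail.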

On the OpenShift side there are two API servers: openshift-apiserver and openshift-kube-apiserver. I found the timeout error in both of them, and confirmed that telnet gatekeeper-webhook-service.infra-gatekeeper.svc 443 did not work from these namespaces. So I created a network policy in the gatekeeper namespace that allows the two apiserver namespaces to reach the service in the gatekeeper namespace:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-from-openshift-apiserver
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchExpressions:
              - key: kubernetes.io/metadata.name
                operator: In
                values: ["openshift-apiserver", "openshift-kube-apiserver"]
  policyTypes:
    - Ingress
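As a rough check of which source namespaces the namespaceSelector above lets in, here is a small sketch of label-selector matching. `selector_matches` is a hypothetical helper, not a Kubernetes client call, and it assumes every namespace carries the kubernetes.io/metadata.name label set to its own name. The "default" namespace is only there as a counterexample.

```python
def selector_matches(labels, match_expressions):
    """Return True if `labels` satisfies every matchExpressions entry."""
    for expr in match_expressions:
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            ok = key not in labels or labels[key] not in values
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


# The namespaceSelector from the NetworkPolicy above.
selector = [{
    "key": "kubernetes.io/metadata.name",
    "operator": "In",
    "values": ["openshift-apiserver", "openshift-kube-apiserver"],
}]

# Assumption: each namespace is labelled with its own name.
namespaces = {
    name: {"kubernetes.io/metadata.name": name}
    for name in ("openshift-apiserver", "openshift-kube-apiserver", "default")
}

allowed = [n for n, labels in namespaces.items() if selector_matches(labels, selector)]
print(allowed)
```

Only the two apiserver namespaces pass the selector, so ingress from any other namespace is still subject to whatever other policies exist in the gatekeeper namespace.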

Then everything was OK. But I still don't know why creating Kubernetes objects such as Service / PVC worked before these changes.