linode / apl-core

Akamai App Platform for LKE
https://apl-docs.net
Apache License 2.0

Installation (and deletion) jobs fail as a result of Azure Policies #1316

Closed richardvdveen closed 1 year ago

richardvdveen commented 1 year ago

Describe the bug: (a clear and concise description of what the bug is)

This might not be considered a bug, but it limits the automatic installation of Otomi on AKS clusters with Azure Policies enabled. The installation (and deletion) jobs that Otomi creates to install and remove Otomi appear to depend on containers that allow privilege escalation. Right now, I need to disable the Azure Policy on the AKS resource and then run the installation job (which then works fine). There may be a good reason for this, but if not, I would suggest removing the privilege escalation.

To Reproduce: (steps to reproduce the behavior)

  1. Provision an AKS cluster in Microsoft Azure
  2. Have Azure Policies in place that prevent privilege escalation in containers
  3. Install Otomi using the following command: helm install otomi otomi/otomi --set cluster.name=aks-iota --set cluster.provider=azure
  4. The installation job will start, but it fails to create its pod because the install container is rejected as a privilege escalation container: Error creating: admission webhook "validation.gatekeeper.sh" denied the request: [azurepolicy-k8sazurev3noprivilegeescalatio-e1c9a8299f4f47efd9a5] Privilege escalation container is not allowed: install

Expected behavior: (a clear and concise description of what you expected to happen)

I would expect to be able to install Otomi without the requirement for privilege escalation on the containers used by the Otomi installation and deletion jobs.

Screenshots: (if applicable, add screenshots to help explain your problem)

Name:             otomi
Namespace:        default
Selector:         controller-uid=d10e75ef-decf-4851-867f-d5a376c701b9
Labels:           app.kubernetes.io/instance=otomi
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=otomi
                  app.kubernetes.io/version=v0.26.2
                  helm.sh/chart=otomi-0.5.54
Annotations:      batch.kubernetes.io/job-tracking: 
                  meta.helm.sh/release-name: otomi
                  meta.helm.sh/release-namespace: default
Parallelism:      1
Completions:      1
Completion Mode:  NonIndexed
Start Time:       Tue, 10 Oct 2023 14:06:49 +0200
Pods Statuses:    0 Active (0 Ready) / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app.kubernetes.io/instance=otomi
                    app.kubernetes.io/name=otomi
                    controller-uid=d10e75ef-decf-4851-867f-d5a376c701b9
                    job-name=otomi
  Service Account:  otomi
  Containers:
   install:
    Image:      otomi/core:v0.26.2
    Port:       <none>
    Host Port:  <none>
    Command:
      bash
      -c
    Args:
      kubectl create ns otomi &> /dev/null
      binzx/otomi validate-cluster && binzx/otomi bootstrap && binzx/otomi apply

    Limits:
      cpu:     2
      memory:  2Gi
    Requests:
      cpu:     1
      memory:  1Gi
    Environment:
      VERBOSITY:     1
      ENV_DIR:       /home/app/stack/env
      VALUES_INPUT:  /secret/values.yaml
    Mounts:
      /home/app/stack/env from otomi-values (rw)
      /secret from values-secret (rw)
  Volumes:
   values-secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  otomi-values
    Optional:    false
   otomi-values:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
Events:
  Type     Reason        Age                 From            Message
  ----     ------        ----                ----            -------
  Warning  FailedCreate  76s (x4 over 2m6s)  job-controller  Error creating: admission webhook "validation.gatekeeper.sh" denied the request: [azurepolicy-k8sazurev3noprivilegeescalatio-e1c9a8299f4f47efd9a5] Privilege escalation container is not allowed: install

Cluster(s):

j-zimnowoda commented 1 year ago

Great input, @richardvdveen. I think it is related to the securityContext, which should be improved.
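
For reference, a minimal sketch of a container securityContext that such a policy would likely accept. This is an illustration, not the actual chart template: the container name and image are taken from the job description above, while the surrounding structure is assumed, and it is also an assumption that the policy rejects containers where `allowPrivilegeEscalation` is not explicitly set to `false`.

```yaml
# Hypothetical fragment of the Job's pod template; not taken from the otomi chart.
spec:
  template:
    spec:
      containers:
        - name: install
          image: otomi/core:v0.26.2
          securityContext:
            # Explicitly disallow privilege escalation. Leaving this field unset
            # is assumed to be what the Azure Policy constraint denies.
            allowPrivilegeEscalation: false
```

Whether the install and deletion jobs still work with this setting would need to be verified, since the thread does not say why the containers are currently created without it.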