Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

[Breaking Change] Required Azure Policy add-on upgrade from Policy_v1 to Policy_v2 addon #1606

Closed RamyasreeChakka closed 3 years ago

RamyasreeChakka commented 4 years ago

Update to Azure Policy Addon (#1488)

The Azure Policy Add-on for AKS has released a new version that integrates with OPA Gatekeeper v3. For detailed instructions on enabling the Azure Policy Add-on on AKS, see "Understand Azure Policy for Kubernetes clusters".

Impact

If you used the Azure Policy add-on (v1) during the limited preview, it was installed with OPA and Gatekeeper (GK) v2. The add-on has since been updated to a new version (v2), and customers must take action to move to the new format. Policies also differ between v1 and v2.

How to update existing Policy preview installs

To update an existing cluster's Azure Policy Add-on to the new version, disable the add-on with az aks disable-addons and then re-enable it with az aks enable-addons.
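
The disable-then-re-enable sequence above can be sketched as a small Python helper that only builds the two az invocations and prints them for review, without running anything. This is a sketch under assumptions: the add-on is addressed as azure-policy, and the cluster and resource group names are placeholders that do not appear in this thread.

```python
import shlex


def policy_addon_upgrade_commands(cluster_name, resource_group):
    """Return the two az CLI invocations, in order: disable, then re-enable."""
    # "azure-policy" is assumed to be the add-on name accepted by --addons.
    common = [
        "--addons", "azure-policy",
        "--name", cluster_name,
        "--resource-group", resource_group,
    ]
    return [
        ["az", "aks", "disable-addons", *common],
        ["az", "aks", "enable-addons", *common],
    ]


if __name__ == "__main__":
    # Placeholder names; substitute your own cluster and resource group.
    for cmd in policy_addon_upgrade_commands("myAKSCluster", "myResourceGroup"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch side-effect free; each list could be passed to subprocess.run(cmd, check=True) once the names have been checked.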

bhicks329 commented 4 years ago

What's the current status of the Policy Add-on feature for AKS? Are there any ETAs?

RamyasreeChakka commented 4 years ago

@bhicks329 The current ETA for completing the deployment in all regions is 5/21.

jluk commented 4 years ago

To add some color: the previous state of the Policy Addon was a limited preview, which required approval from MSFT to enter. With this release, which is partially rolled out, the Policy Addon is fully public and you can enroll yourself. In addition, it contains the latest capabilities of GKv3, as captured in #1488.

neumanndaniel commented 4 years ago

@RamyasreeChakka @jluk I just enabled the new version of the add-on on my AKS cluster in the North Europe region.

Works fine and I get the v2 statement in the JSON output of az CLI.

  "addonProfiles": {
    "azurepolicy": {
      "config": {
        "version": "v2"
      },
      "enabled": true,
      "identity": null
    },
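
The version check on that JSON output can be automated. A minimal sketch, assuming the cluster description (for example from az aks show) has been captured as a string; the function name is hypothetical, and the key is matched case-insensitively as a guess that its casing may vary between CLI or API versions.

```python
import json


def policy_addon_version(cluster_json):
    """Return the version of the enabled Azure Policy add-on, or None."""
    profiles = json.loads(cluster_json).get("addonProfiles") or {}
    for name, profile in profiles.items():
        # Case-insensitive match: key casing is an assumption, not documented here.
        if name.lower() == "azurepolicy" and profile.get("enabled"):
            return (profile.get("config") or {}).get("version")
    return None


# Sample built from the fragment quoted above, closed so that it parses.
sample = (
    '{"addonProfiles": {"azurepolicy": '
    '{"config": {"version": "v2"}, "enabled": true, "identity": null}}}'
)
print(policy_addon_version(sample))  # prints v2
```

As the rest of this thread shows, a "v2" result only confirms the add-on profile; it does not prove that the pods were actually scheduled.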

But the Azure Policy Pod in the kube-system namespace is missing.

According to the docs it should be there. (https://docs.microsoft.com/en-us/azure/governance/policy/concepts/policy-for-kubernetes?toc=/azure/aks/toc.json#install-azure-policy-add-on-for-aks)

Is this due to the ongoing deployment/rollout to all regions?

RamyasreeChakka commented 4 years ago

@neumanndaniel Thanks for trying the Azure Policy Add-on and reporting the issue here. The new version of the add-on is available in the North Europe region, and we expect it to work. Can you please share your cluster details (cluster resource ID)? We will investigate and get back to you.

If the add-on is installed properly, you should see an azure-policy-xxx pod in the kube-system namespace, like below:

  NAME                           READY   STATUS    RESTARTS   AGE
  azure-policy-5cddd9465-kbhdk   1/1     Running   0          7h1m
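
The same pod check can be scripted. A minimal sketch, assuming the text output of kubectl get pods -n kube-system has been captured as a string; the function name and the azure-policy- prefix default are illustrative choices based on the pod name shown above.

```python
def running_policy_pods(kubectl_output, prefix="azure-policy-"):
    """Return names of pods that start with prefix and have STATUS Running."""
    lines = [ln.split() for ln in kubectl_output.strip().splitlines() if ln.strip()]
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    # Locate columns by header name instead of hard-coding positions.
    name_col, status_col = header.index("NAME"), header.index("STATUS")
    needed = max(name_col, status_col)
    return [
        row[name_col]
        for row in rows
        if len(row) > needed
        and row[name_col].startswith(prefix)
        and row[status_col] == "Running"
    ]


sample = """NAME READY STATUS RESTARTS AGE
azure-policy-5cddd9465-kbhdk 1/1 Running 0 7h1m
"""
print(running_policy_pods(sample))
```

An empty list here, combined with an add-on profile that reports v2, matches the symptom reported in this thread.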

neumanndaniel commented 4 years ago

@RamyasreeChakka

Resource ID is /subscriptions/fe96473f-ec11-45cb-be64-e7343f59efeb/resourcegroups/azst-aks2/providers/Microsoft.ContainerService/managedClusters/azst-aks2

Azure Policy Pod still missing. I have the same issue on another AKS cluster in another subscription.

r-t-m commented 4 years ago

@RamyasreeChakka

We are using AKS with Azure Policy and PSP enabled. This update, as it rolls out, breaks any cluster deployment with that combination. Gatekeeper v3 doesn't create a psp/role/rolebinding for itself, and the deployment just gets stuck:

Warning  FailedCreate  115s (x20 over 40m)  replicaset-controller  Error creating: pods "gatekeeper-controller-manager-d5cd87796-" is forbidden: unable to validate against any pod security policy: []

This prevents any subsequent cluster configuration, like namespace creation, until the gatekeeper issue is resolved by applying a role/rolebinding to use the privileged PSP or a custom one.

$ kubectl create namespace test1

Error from server (InternalError): Internal error occurred: failed calling webhook "check-ignore-label.gatekeeper.sh": Post https://gatekeeper-webhook-service.gatekeeper-system.svc:443/v1/admitlabel?timeout=5s: dial tcp 10.0.128.161:443: connect: connection refused
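
The role/rolebinding workaround described above might look like the following sketch, which generates the two RBAC objects as JSON that could be piped to kubectl apply -f -. Several things here are assumptions rather than facts from this thread: the PSP is assumed to be named privileged, the object names are hypothetical, and the binding targets every service account in the gatekeeper-system namespace because the Gatekeeper service account name is not given here. PodSecurityPolicy was removed in Kubernetes 1.25, so this applies only to older clusters.

```python
import json


def psp_use_manifests(namespace="gatekeeper-system", psp_name="privileged"):
    """Build a Role and RoleBinding that let pods in namespace use one PSP."""
    role_name = f"psp-use-{psp_name}"  # hypothetical object name
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [{
            "apiGroups": ["policy"],
            "resources": ["podsecuritypolicies"],
            "resourceNames": [psp_name],  # assumed PSP name
            "verbs": ["use"],
        }],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name, "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
        # Bind to all service accounts in the namespace (a deliberate simplification).
        "subjects": [{
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Group",
            "name": f"system:serviceaccounts:{namespace}",
        }],
    }
    return {"apiVersion": "v1", "kind": "List", "items": [role, binding]}


print(json.dumps(psp_use_manifests(), indent=2))
```

Binding the whole namespace group is broader than strictly needed; narrowing the subject to the specific Gatekeeper service account would be tighter once its name is confirmed on the cluster.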

RamyasreeChakka commented 4 years ago

@r-t-m Thanks for reporting the issue. We are working on the fix now.

@neumanndaniel Thanks for cluster details, we are investigating the issue.

neumanndaniel commented 4 years ago

@RamyasreeChakka FYI. I redeployed my other AKS cluster and that solved the issue.

ghost commented 4 years ago

Action required from @Azure/aks-pm

jluk commented 3 years ago

Closing this issue. If new problems arise for users moving from v1 to v2 of Azure Policy, just leave a comment and we will revisit.