Azure / azure-cli-extensions

Public Repository for Extensions of Azure CLI.
https://docs.microsoft.com/en-us/cli/azure
MIT License

AKS advanced networking private cluster kubenet will not use existing NSG and route tables #2240

Open mikebranstein opened 4 years ago

mikebranstein commented 4 years ago

Extension name (the extension in question)

aks-preview

Description of issue (in as much detail as possible)

We're attempting to deploy a private AKS cluster with advanced networking and the kubenet network plugin into an existing virtual network. The docs state that the deployment will detect an existing NSG and route table and use them instead of creating new resources. When run, the CLI does not detect the existing NSG and route table, and instead attempts to create them in the resource group specified in the command arguments. This is undesired and does not match the documented behavior.

My setup:

Command Run:

az aks create \
  --name az-aks \
  --resource-group az-aks-rg \
  --subscription <subscription-id> \
  --location westus2 \
  --kubernetes-version 1.17.9 \
  --service-principal <service-principal-id> \
  --client-secret <service-principal-client-secret> \
  --enable-private-cluster \
  --network-plugin kubenet \
  --outbound-type userDefinedRouting \
  --vnet-subnet-id "/subscriptions/<subscription-id>/resourceGroups/az-aks-rg/providers/Microsoft.Network/virtualNetworks/az-aks-vnet/subnets/az-nodepool-subnet" \
  --service-cidr 172.16.0.0/16 \
  --dns-service-ip 172.16.0.10 \
  --docker-bridge-address 172.17.0.1/16 \
  --vm-set-type VirtualMachineScaleSets \
  --node-count 3 \
  --enable-aad \
  --aad-admin-group-object-ids <aad-group-id> \
  --aad-tenant-id <tenant-id>

When run, a new NSG and route table are created in the az-aks-rg resource group.
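
One way to confirm which NSG and route table are already associated with the subnet is to query the subnet directly. The sketch below is an illustration and not part of the original report: it splits the subnet ID from the command above into its segments and builds an `az network vnet subnet show` call. The helper names `parse_resource_id` and `subnet_show_command` are hypothetical, and the code only constructs the command; it does not run it.

```python
import shlex


def parse_resource_id(resource_id):
    """Split an Azure resource ID into a dict of its key/value segments."""
    parts = [p for p in resource_id.split("/") if p]
    return dict(zip(parts[0::2], parts[1::2]))


def subnet_show_command(subnet_id):
    """Build (but do not run) an az command that shows the subnet's NSG and route table."""
    seg = parse_resource_id(subnet_id)
    return [
        "az", "network", "vnet", "subnet", "show",
        "--resource-group", seg["resourceGroups"],
        "--vnet-name", seg["virtualNetworks"],
        "--name", seg["subnets"],
        # JMESPath projection that keeps only the two associations of interest.
        "--query", "{nsg: networkSecurityGroup.id, routeTable: routeTable.id}",
    ]


# Subnet ID copied from the command above, placeholder included.
subnet_id = (
    "/subscriptions/<subscription-id>/resourceGroups/az-aks-rg"
    "/providers/Microsoft.Network/virtualNetworks/az-aks-vnet"
    "/subnets/az-nodepool-subnet"
)
print(shlex.join(subnet_show_command(subnet_id)))
```

If the resulting command reports both fields as null, that would suggest no NSG or route table is associated with the subnet, which is worth ruling out before treating the behavior as a bug.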

I happened to notice this issue because I have a policy on az-aks-rg that denies the creation of Microsoft.Network/* resource types. See the error below:

Operation failed with status: 'Bad Request'. Details: Provisioning of resource(s) for container service az-aks in resource group az-aks-rg failed. Message: The template deployment failed with multiple errors. Please see details for more information.. Details: [{"code":"RequestDisallowedByPolicy","message":"Resource 'aks-agentpool-42622690-routetable' was disallowed by policy. Policy identifiers: '[{\"policyAssignment\":{\"name\":\"Ensure a select list of Resources are Denied for the AKS Resource Group\",\"id\":\"/subscriptions/subscription-id/resourceGroups/az-aks-rg/providers/Microsoft.Authorization/policyAssignments/AKS-RG-Deny-ResourceTypes\"},\"policyDefinition\":{\"name\":\"Ensure a select list of Resources are Denied for the AKS Resource Group\",\"id\":\"/subscriptions/subscription-id/providers/Microsoft.Authorization/policyDefinitions/AKS-RG-Deny-ResourceTypes\"}}]'.","target":"aks-agentpool-42622690-routetable"},{"code":"RequestDisallowedByPolicy","message":"Resource 'aks-agentpool-42622690-nsg' was disallowed by policy. Policy identifiers: '[{\"policyAssignment\":{\"name\":\"Ensure a select list of Resources are Denied for the AKS Resource Group\",\"id\":\"/subscriptions/subscription-id/resourceGroups/az-aks-rg/providers/Microsoft.Authorization/policyAssignments/AKS-RG-Deny-ResourceTypes\"},\"policyDefinition\":{\"name\":\"Ensure a select list of Resources are Denied for the AKS Resource Group\",\"id\":\"/subscriptions/subscription-id/providers/Microsoft.Authorization/policyDefinitions/AKS-RG-Deny-ResourceTypes\"}}]'.","target":"aks-agentpool-42622690-nsg"}]
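
The error above is hard to read because the policy details are packed into a JSON array at the end of the message. A minimal sketch for pulling out which resources were blocked, assuming the message keeps the `Details: [...]` shape shown above; the function name `blocked_resources` is hypothetical and the sample text is a shortened stand-in for the real output:

```python
import json


def blocked_resources(error_text):
    """Return (code, target) pairs from the JSON array that follows 'Details: '."""
    marker = "Details: ["
    start = error_text.find(marker)
    if start == -1:
        return []
    payload = error_text[start + len("Details: "):]
    # raw_decode parses the leading JSON value and ignores any trailing text.
    details, _end = json.JSONDecoder().raw_decode(payload)
    return [(item.get("code"), item.get("target")) for item in details]


# Shortened stand-in for the CLI error shown above (policy identifiers omitted).
sample_details = [
    {
        "code": "RequestDisallowedByPolicy",
        "message": "Resource 'aks-agentpool-42622690-routetable' was disallowed by policy.",
        "target": "aks-agentpool-42622690-routetable",
    },
    {
        "code": "RequestDisallowedByPolicy",
        "message": "Resource 'aks-agentpool-42622690-nsg' was disallowed by policy.",
        "target": "aks-agentpool-42622690-nsg",
    },
]
sample_error = (
    "Operation failed with status: 'Bad Request'. Details: Provisioning of "
    "resource(s) for container service az-aks in resource group az-aks-rg "
    "failed. Details: " + json.dumps(sample_details)
)

for code, target in blocked_resources(sample_error):
    print(code, target)
```

Run against the full error, this would list the route table and the NSG as the two resources rejected by the policy.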

Expected results:

ghost commented 4 years ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Azure/aks-pm.

yonzhan commented 4 years ago

aks-preview

mikebranstein commented 4 years ago

I've begun to dig deeper. When I disable enforcement of my resource type policy, AKS deploys successfully and does not deploy the noted NSG and route table to az-aks-rg. Instead, the NSG is deployed to the new RG that is created to house the node pool VMSS. The existing route table was also used.

Based on this new information, I believe this is a validation issue: when the template is submitted, validation behaves as though the NSG and route table will be created in az-aks-rg (the wrong resource group), but when the template actually deploys, they are not created there.

I also do not believe this is a CLI-specific error: I attempted to deploy AKS with Terraform using the exact same configuration and values, and received the same error noted above.

navba-MSFT commented 2 years ago

@mikebranstein Apologies for the late reply. You can refer to this article for using an existing subnet and route table with kubenet, and its limitations. This issue has been open for quite some time. Could you please let us know if you still need assistance with it? Awaiting your reply.

ghost commented 2 years ago

Hi, we're sending this friendly reminder because we haven't heard back from you in a while. We need more information about this issue to help address it. Please be sure to give us your input within the next 7 days. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

mikebranstein commented 2 years ago

@navba-MSFT Thank you. The issue I'm reporting is that the AKS CLI isn't using my existing NSG and Route Table when using the --vnet-subnet-id CLI option. The CLI tries to create a new NSG and Route Table.

This may have changed since the original post. I will retest and validate so this can be closed out.

ghost commented 1 year ago

@navba-MSFT It looks like this problem was solved for the az CLI but is still present for ARM templates and Bicep. If I have a policy that prevents the creation of route tables, except for the one that is connected to the subnet, validation fails with the same error: Resource 'aks-agentpool--routetable' was disallowed by policy.