Azure / azure-cli

Azure Command-Line Interface
MIT License

az aks command invoke fails with KubernetesPerformanceError with AAD + RBAC-enabled cluster #27776

Open RoFz opened 1 year ago

RoFz commented 1 year ago

Describe the bug

az aks command invoke fails with KubernetesPerformanceError with an AAD + RBAC-enabled cluster

Related command

az aks command invoke -g $rgname -n $aksname --command "kubectl get ns"

Errors

ERROR: (KubernetesPerformanceError) Failed to run command due to cluster perf issue, container command-8021652d809147db8d927a42ae8f626c in aks-command namespace did not start within 30s on your cluster, retry may helps. If issue persist, you may need to tune your cluster with better performance (larger node/paid tier).
Code: KubernetesPerformanceError
Message: Failed to run command due to cluster perf issue, container command-8021652d809147db8d927a42ae8f626c in aks-command namespace did not start within 30s on your cluster, retry may helps. If issue persist, you may need to tune your cluster with better performance (larger node/paid tier).
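
Since the error text itself suggests a retry, here is a minimal sketch of a retry wrapper around the failing command. The `retry` helper, its attempt count, and its delay are hypothetical and not part of the Azure CLI. A retry only helps with a transient slow start; it will not help if the failure is tied to the cluster configuration, as this report suggests.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of az): retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}
```

Example usage, with illustrative values for attempts and delay:

    retry 3 10 az aks command invoke -g "$rgname" -n "$aksname" --command "kubectl get ns"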

Issue script & Debug output

DEBUG: cli.knack.log: File logging enabled - writing logs to '/Users/<suppressed>/azureclilog'.
DEBUG: cli.knack.cli: Command arguments: ['aks', 'command', 'invoke', '-g', 'rg-<suppressed>-playground-d-datadog', '-n', 'aks-<suppressed>-d-datadog-001', '--command', 'kubectl get ns', '--debug']
DEBUG: cli.knack.cli: __init__ debug log:
Color is disabled by config.
DEBUG: cli.knack.cli: Event: Cli.PreExecute []
DEBUG: cli.knack.cli: Event: CommandParser.OnGlobalArgumentsCreate [<function CLILogging.on_global_arguments at 0x1059f2560>, <function OutputProducer.on_global_arguments at 0x105a87520>, <function CLIQuery.on_global_arguments at 0x105af49d0>]
DEBUG: cli.knack.cli: Event: CommandInvoker.OnPreCommandTableCreate []
DEBUG: cli.azure.cli.core: Modules found from index for 'aks': ['azure.cli.command_modules.acs']
DEBUG: cli.azure.cli.core: Loading command modules:
DEBUG: cli.azure.cli.core: Name                  Load Time    Groups  Commands
DEBUG: cli.azure.cli.core: acs                       0.035         7        54
DEBUG: cli.azure.cli.core: Total (1)                 0.035         7        54
DEBUG: cli.azure.cli.core: These extensions are not installed and will be skipped: ['azext_ai_examples', 'azext_next']
DEBUG: cli.azure.cli.core: Loading extensions:
DEBUG: cli.azure.cli.core: Name                  Load Time    Groups  Commands  Directory
DEBUG: cli.azure.cli.core: Total (0)                 0.000         0         0  
DEBUG: cli.azure.cli.core: Loaded 7 groups, 54 commands.
DEBUG: cli.azure.cli.core: Found a match in the command table.
DEBUG: cli.azure.cli.core: Raw command  : aks command invoke
DEBUG: cli.azure.cli.core: Command table: aks command invoke
DEBUG: cli.knack.cli: Event: CommandInvoker.OnPreCommandTableTruncate [<function AzCliLogging.init_command_file_logging at 0x10659edd0>]
DEBUG: cli.azure.cli.core.azlogging: metadata file logging enabled - writing logs to '/Users/<suppressed>/.azure/commands/2023-11-06.14-45-56.aks_command_invoke.65251.log'.
INFO: az_command_data_logger: command args: aks command invoke -g {} -n {} --command {} --debug
DEBUG: cli.knack.cli: Event: CommandInvoker.OnPreArgumentLoad [<function register_global_subscription_argument.<locals>.add_subscription_parameter at 0x1065bb7f0>]
DEBUG: cli.knack.cli: Event: CommandInvoker.OnPostArgumentLoad []
DEBUG: cli.knack.cli: Event: CommandInvoker.OnPostCommandTableCreate [<function register_ids_argument.<locals>.add_ids_arguments at 0x106699360>, <function register_cache_arguments.<locals>.add_cache_arguments at 0x106699480>]
DEBUG: cli.knack.cli: Event: CommandInvoker.OnCommandTableLoaded []
DEBUG: cli.knack.cli: Event: CommandInvoker.OnPreParseArgs []
DEBUG: cli.knack.cli: Event: CommandInvoker.OnPostParseArgs [<function OutputProducer.handle_output_argument at 0x105a875b0>, <function CLIQuery.handle_query_parameter at 0x105af4a60>, <function register_ids_argument.<locals>.parse_ids_arguments at 0x1066993f0>]
DEBUG: cli.azure.cli.core.commands.client_factory: Getting management service client client_type=ContainerServiceClient
DEBUG: cli.azure.cli.core.auth.persistence: build_persistence: location='/Users/<suppressed>/.azure/msal_token_cache.json', encrypt=False
DEBUG: cli.azure.cli.core.auth.binary_cache: load: /Users/<suppressed>/.azure/msal_http_cache.bin
DEBUG: urllib3.util.retry: Converted retries value: 1 -> Retry(total=1, connect=None, read=None, redirect=None, status=None)
DEBUG: msal.authority: openid_config = {'token_endpoint': 'https://login.microsoftonline.com/<suppressed>/oauth2/v2.0/token', 'token_endpoint_auth_methods_supported': ['client_secret_post', 'private_key_jwt', 'client_secret_basic'], 'jwks_uri': 'https://login.microsoftonline.com/<suppressed>/discovery/v2.0/keys', 'response_modes_supported': ['query', 'fragment', 'form_post'], 'subject_types_supported': ['pairwise'], 'id_token_signing_alg_values_supported': ['RS256'], 'response_types_supported': ['code', 'id_token', 'code id_token', 'id_token token'], 'scopes_supported': ['openid', 'profile', 'email', 'offline_access'], 'issuer': 'https://login.microsoftonline.com/<suppressed>/v2.0', 'request_uri_parameter_supported': False, 'userinfo_endpoint': 'https://graph.microsoft.com/oidc/userinfo', 'authorization_endpoint': 'https://login.microsoftonline.com/<suppressed>/oauth2/v2.0/authorize', 'device_authorization_endpoint': 'https://login.microsoftonline.com/<suppressed>/oauth2/v2.0/devicecode', 'http_logout_supported': True, 'frontchannel_logout_supported': True, 'end_session_endpoint': 'https://login.microsoftonline.com/<suppressed>/oauth2/v2.0/logout', 'claims_supported': ['sub', 'iss', 'cloud_instance_name', 'cloud_instance_host_name', 'cloud_graph_host_name', 'msgraph_host', 'aud', 'exp', 'iat', 'auth_time', 'acr', 'nonce', 'preferred_username', 'name', 'tid', 'ver', 'at_hash', 'c_hash', 'email'], 'kerberos_endpoint': 'https://login.microsoftonline.com/<suppressed>/kerberos', 'tenant_region_scope': 'EU', 'cloud_instance_name': 'microsoftonline.com', 'cloud_graph_host_name': 'graph.windows.net', 'msgraph_host': 'graph.microsoft.com', 'rbac_url': 'https://pas.windows.net'}
DEBUG: msal.application: Broker enabled? False
DEBUG: cli.azure.cli.core.auth.credential_adaptor: CredentialAdaptor.get_token: scopes=('https://management.core.windows.net//.default',), kwargs={}
DEBUG: cli.azure.cli.core.auth.msal_authentication: UserCredential.get_token: scopes=('https://management.core.windows.net//.default',), claims=None, kwargs={}
DEBUG: msal.application: Cache hit an AT
DEBUG: msal.telemetry: Generate or reuse correlation_id: f4d3657a-c9fb-4ea2-b220-9016f6879b4d
DEBUG: cli.azure.cli.core.sdk.policies: Request URL: 'https://management.azure.com/subscriptions/<suppressed>/resourceGroups/rg-<suppressed>-playground-d-datadog/providers/Microsoft.ContainerService/managedClusters/aks-<suppressed>-d-datadog-001?api-version=2023-07-01'
DEBUG: cli.azure.cli.core.sdk.policies: Request method: 'GET'
DEBUG: cli.azure.cli.core.sdk.policies: Request headers:
DEBUG: cli.azure.cli.core.sdk.policies:     'Accept': 'application/json'
DEBUG: cli.azure.cli.core.sdk.policies:     'x-ms-client-request-id': '3169f6d0-7cb3-11ee-9d12-5e4835e9c80c'
DEBUG: cli.azure.cli.core.sdk.policies:     'CommandName': 'aks command invoke'
DEBUG: cli.azure.cli.core.sdk.policies:     'ParameterSetName': '-g -n --command --debug'
DEBUG: cli.azure.cli.core.sdk.policies:     'User-Agent': 'AZURECLI/2.53.0 (HOMEBREW) azsdk-python-azure-mgmt-containerservice/26.0.0 Python/3.10.13 (macOS-13.6.1-arm64-arm-64bit)'
DEBUG: cli.azure.cli.core.sdk.policies:     'Authorization': '*****'
DEBUG: cli.azure.cli.core.sdk.policies: Request body:
DEBUG: cli.azure.cli.core.sdk.policies: This request has no body
DEBUG: urllib3.connectionpool: Starting new HTTPS connection (1): management.azure.com:443
DEBUG: urllib3.connectionpool: https://management.azure.com:443 "GET /subscriptions/<suppressed>/resourceGroups/rg-<suppressed>-playground-d-datadog/providers/Microsoft.ContainerService/managedClusters/aks-<suppressed>-d-datadog-001?api-version=2023-07-01 HTTP/1.1" 200 None
DEBUG: cli.azure.cli.core.sdk.policies: Response status: 200
DEBUG: cli.azure.cli.core.sdk.policies: Response headers:
DEBUG: cli.azure.cli.core.sdk.policies:     'Cache-Control': 'no-cache'
DEBUG: cli.azure.cli.core.sdk.policies:     'Pragma': 'no-cache'
DEBUG: cli.azure.cli.core.sdk.policies:     'Transfer-Encoding': 'chunked'
DEBUG: cli.azure.cli.core.sdk.policies:     'Content-Type': 'application/json'
DEBUG: cli.azure.cli.core.sdk.policies:     'Content-Encoding': 'gzip'
DEBUG: cli.azure.cli.core.sdk.policies:     'Expires': '-1'
DEBUG: cli.azure.cli.core.sdk.policies:     'Vary': 'Accept-Encoding'
DEBUG: cli.azure.cli.core.sdk.policies:     'x-ms-routing-request-id': 'FRANCESOUTH:20231106T144557Z:5c18bcbf-cc15-4ced-9e75-b02c235130fb'
DEBUG: cli.azure.cli.core.sdk.policies:     'x-ms-ratelimit-remaining-subscription-reads': '11998'
DEBUG: cli.azure.cli.core.sdk.policies:     'x-ms-correlation-request-id': '5c18bcbf-cc15-4ced-9e75-b02c235130fb'
DEBUG: cli.azure.cli.core.sdk.policies:     'x-ms-request-id': '848e46ee-ee21-421d-9cb4-ef6113697ee9'
DEBUG: cli.azure.cli.core.sdk.policies:     'Strict-Transport-Security': 'max-age=31536000; includeSubDomains'
DEBUG: cli.azure.cli.core.sdk.policies:     'X-Content-Type-Options': 'nosniff'
DEBUG: cli.azure.cli.core.sdk.policies:     'Date': 'Mon, 06 Nov 2023 14:45:56 GMT'
DEBUG: cli.azure.cli.core.sdk.policies: Response content:
DEBUG: cli.azure.cli.core.sdk.policies: {
  "id": "/subscriptions/<suppressed>/resourcegroups/rg-<suppressed>-playground-d-datadog/providers/Microsoft.ContainerService/managedClusters/aks-<suppressed>-d-datadog-001",
  "location": "westeurope",
  "name": "aks-<suppressed>-d-datadog-001",
  "type": "Microsoft.ContainerService/ManagedClusters",
  "properties": {
   "provisioningState": "Succeeded",
   "powerState": {
    "code": "Running"
   },
   "kubernetesVersion": "1.26.6",
   "currentKubernetesVersion": "1.26.6",
   "dnsPrefix": "aks-z004ct-rg-<suppressed>-play-e52deb",
   "azurePortalFQDN": "9240a6210d9887658f537b407d737b22-priv.portal.hcp.westeurope.azmk8s.io",
   "privateFQDN": "aks-z004ct-rg-<suppressed>-play-e52deb-u8ezsebo.9c4bd865-256d-430a-82a9-d200addcc9aa.privatelink.westeurope.azmk8s.io",
   "agentPoolProfiles": [
    {
     "name": "nodepool1",
     "count": 1,
     "vmSize": "Standard_D2d_v5",
     "osDiskSizeGB": 32,
     "osDiskType": "Ephemeral",
     "kubeletDiskType": "OS",
     "vnetSubnetID": "/subscriptions/<suppressed>/resourceGroups/rg-<suppressed>-playground-d-datadog/providers/Microsoft.Network/virtualNetworks/vnet-<suppressed>-d-datadog/subnets/snet-datadog-aks",
     "maxPods": 250,
     "type": "VirtualMachineScaleSets",
     "enableAutoScaling": false,
     "provisioningState": "Succeeded",
     "powerState": {
      "code": "Running"
     },
     "orchestratorVersion": "1.26.6",
     "currentOrchestratorVersion": "1.26.6",
     "enableNodePublicIP": false,
     "mode": "System",
     "enableEncryptionAtHost": true,
     "enableUltraSSD": false,
     "osType": "Linux",
     "osSKU": "Ubuntu",
     "nodeImageVersion": "AKSUbuntu-2204gen2containerd-202310.04.0",
     "upgradeSettings": {},
     "enableFIPS": false
    }
   ],
   "linuxProfile": {
    "adminUsername": "rofz",
    "ssh": {
     "publicKeys": [
      {
       "keyData": "ssh-rsa <suppressed> <suppressed>@<suppressed>\n"
      }
     ]
    }
   },
   "windowsProfile": {
    "adminUsername": "azureuser",
    "enableCSIProxy": true
   },
   "servicePrincipalProfile": {
    "clientId": "msi"
   },
   "addonProfiles": {
    "azurepolicy": {
     "enabled": true,
     "config": null,
     "identity": {
      "resourceId": "/subscriptions/<suppressed>/resourcegroups/rg-<suppressed>-playground-d-datadog.aks/providers/Microsoft.ManagedIdentity/userAssignedIdentities/azurepolicy-aks-<suppressed>-d-datadog-001",
      "clientId": "448b1cf7-7ab9-4fcc-93f8-9b830a6682c8",
      "objectId": "65d33217-5bb6-4801-b836-5e7e273ebb88"
     }
    },
    "omsagent": {
     "enabled": true,
     "config": {
      "logAnalyticsWorkspaceResourceID": "/subscriptions/<suppressed>/resourceGroups/rg-<suppressed>-playground-d-datadog/providers/Microsoft.OperationalInsights/workspaces/log-<suppressed>-d-datadog",
      "useAADAuth": "true"
     }
    }
   },
   "nodeResourceGroup": "rg-<suppressed>-playground-d-datadog.aks",
   "enableRBAC": true,
   "supportPlan": "KubernetesOfficial",
   "networkProfile": {
    "networkPlugin": "azure",
    "networkPluginMode": "overlay",
    "networkPolicy": "azure",
    "networkDataplane": "azure",
    "loadBalancerSku": "Standard",
    "loadBalancerProfile": {
     "managedOutboundIPs": {
      "count": 1
     },
     "effectiveOutboundIPs": [
      {
       "id": "/subscriptions/<suppressed>/resourceGroups/rg-<suppressed>-playground-d-datadog.aks/providers/Microsoft.Network/publicIPAddresses/<suppressed>"
      }
     ]
    },
    "podCidr": "172.17.0.0/16",
    "serviceCidr": "192.168.8.0/24",
    "dnsServiceIP": "192.168.8.10",
    "outboundType": "loadBalancer",
    "podCidrs": [
     "172.17.0.0/16"
    ],
    "serviceCidrs": [
     "192.168.8.0/24"
    ],
    "ipFamilies": [
     "IPv4"
    ]
   },
   "aadProfile": {
    "managed": true,
    "adminGroupObjectIDs": null,
    "adminUsers": null,
    "enableAzureRBAC": true,
    "tenantID": "<suppressed>"
   },
   "maxAgentPools": 100,
   "privateLinkResources": [
    {
     "id": "/subscriptions/<suppressed>/resourcegroups/rg-<suppressed>-playground-d-datadog/providers/Microsoft.ContainerService/managedClusters/aks-<suppressed>-d-datadog-001/privateLinkResources/management",
     "name": "management",
     "type": "Microsoft.ContainerService/managedClusters/privateLinkResources",
     "groupId": "management",
     "requiredMembers": [
      "management"
     ]
    }
   ],
   "apiServerAccessProfile": {
    "enablePrivateCluster": true,
    "privateDNSZone": "system",
    "enablePrivateClusterPublicFQDN": false
   },
   "identityProfile": {
    "kubeletidentity": {
     "resourceId": "/subscriptions/<suppressed>/resourcegroups/rg-<suppressed>-playground-d-datadog.aks/providers/Microsoft.ManagedIdentity/userAssignedIdentities/aks-<suppressed>-d-datadog-001-agentpool",
     "clientId": "<suppressed>",
     "objectId": "<suppressed>"
    }
   },
   "autoUpgradeProfile": {
    "nodeOSUpgradeChannel": "NodeImage"
   },
   "disableLocalAccounts": false,
   "securityProfile": {
    "defender": {
     "logAnalyticsWorkspaceResourceId": "/subscriptions/<suppressed>/resourceGroups/DefaultResourceGroup-WEU/providers/Microsoft.OperationalInsights/workspaces/DefaultWorkspace-<suppressed>-WEU",
     "securityMonitoring": {
      "enabled": true
     }
    }
   },
   "storageProfile": {
    "diskCSIDriver": {
     "enabled": true
    },
    "fileCSIDriver": {
     "enabled": true
    },
    "snapshotController": {
     "enabled": true
    }
   },
   "oidcIssuerProfile": {
    "enabled": false
   },
   "workloadAutoScalerProfile": {}
  },
  "identity": {
   "type": "SystemAssigned",
   "principalId": "<suppressed>",
   "tenantId": "<suppressed>"
  },
  "sku": {
   "name": "Base",
   "tier": "Free"
  }
 }
DEBUG: urllib3.util.retry: Converted retries value: 1 -> Retry(total=1, connect=None, read=None, redirect=None, status=None)
DEBUG: msal.authority: openid_config = {'token_endpoint': 'https://login.microsoftonline.com/<suppressed>/oauth2/v2.0/token', 'token_endpoint_auth_methods_supported': ['client_secret_post', 'private_key_jwt', 'client_secret_basic'], 'jwks_uri': 'https://login.microsoftonline.com/<suppressed>/discovery/v2.0/keys', 'response_modes_supported': ['query', 'fragment', 'form_post'], 'subject_types_supported': ['pairwise'], 'id_token_signing_alg_values_supported': ['RS256'], 'response_types_supported': ['code', 'id_token', 'code id_token', 'id_token token'], 'scopes_supported': ['openid', 'profile', 'email', 'offline_access'], 'issuer': 'https://login.microsoftonline.com/<suppressed>/v2.0', 'request_uri_parameter_supported': False, 'userinfo_endpoint': 'https://graph.microsoft.com/oidc/userinfo', 'authorization_endpoint': 'https://login.microsoftonline.com/<suppressed>/oauth2/v2.0/authorize', 'device_authorization_endpoint': 'https://login.microsoftonline.com/<suppressed>/oauth2/v2.0/devicecode', 'http_logout_supported': True, 'frontchannel_logout_supported': True, 'end_session_endpoint': 'https://login.microsoftonline.com/<suppressed>/oauth2/v2.0/logout', 'claims_supported': ['sub', 'iss', 'cloud_instance_name', 'cloud_instance_host_name', 'cloud_graph_host_name', 'msgraph_host', 'aud', 'exp', 'iat', 'auth_time', 'acr', 'nonce', 'preferred_username', 'name', 'tid', 'ver', 'at_hash', 'c_hash', 'email'], 'kerberos_endpoint': 'https://login.microsoftonline.com/<suppressed>/kerberos', 'tenant_region_scope': 'EU', 'cloud_instance_name': 'microsoftonline.com', 'cloud_graph_host_name': 'graph.windows.net', 'msgraph_host': 'graph.microsoft.com', 'rbac_url': 'https://pas.windows.net'}
DEBUG: msal.application: Broker enabled? False
DEBUG: cli.azure.cli.core.auth.msal_authentication: UserCredential.get_token: scopes=('6dae42f8-4368-4678-94ff-3960e28e3630/.default',), claims=None, kwargs={}
DEBUG: msal.application: Cache hit an AT
DEBUG: msal.telemetry: Generate or reuse correlation_id: c543ca19-0887-4b85-8157-7ed30965e34a
DEBUG: cli.azure.cli.core.sdk.policies: Request URL: 'https://management.azure.com/subscriptions/<suppressed>/resourceGroups/rg-<suppressed>-playground-d-datadog/providers/Microsoft.ContainerService/managedClusters/aks-<suppressed>-d-datadog-001/runCommand?api-version=2023-07-01'
DEBUG: cli.azure.cli.core.sdk.policies: Request method: 'POST'
DEBUG: cli.azure.cli.core.sdk.policies: Request headers:
DEBUG: cli.azure.cli.core.sdk.policies:     'Content-Type': 'application/json'
DEBUG: cli.azure.cli.core.sdk.policies:     'Content-Length': '8795'
DEBUG: cli.azure.cli.core.sdk.policies:     'Accept': 'application/json'
DEBUG: cli.azure.cli.core.sdk.policies:     'x-ms-client-request-id': '3169f6d0-7cb3-11ee-9d12-5e4835e9c80c'
DEBUG: cli.azure.cli.core.sdk.policies:     'CommandName': 'aks command invoke'
DEBUG: cli.azure.cli.core.sdk.policies:     'ParameterSetName': '-g -n --command --debug'
DEBUG: cli.azure.cli.core.sdk.policies:     'User-Agent': 'AZURECLI/2.53.0 (HOMEBREW) azsdk-python-azure-mgmt-containerservice/26.0.0 Python/3.10.13 (macOS-13.6.1-arm64-arm-64bit)'
DEBUG: cli.azure.cli.core.sdk.policies:     'Authorization': '*****'
DEBUG: cli.azure.cli.core.sdk.policies: Request body:
DEBUG: cli.azure.cli.core.sdk.policies: {"command": "kubectl get ns", "context": "", "clusterToken": "<suppressed>"}
DEBUG: urllib3.connectionpool: https://management.azure.com:443 "POST /subscriptions/<suppressed>/resourceGroups/rg-<suppressed>-playground-d-datadog/providers/Microsoft.ContainerService/managedClusters/aks-<suppressed>-d-datadog-001/runCommand?api-version=2023-07-01 HTTP/1.1" 400 389
DEBUG: cli.azure.cli.core.sdk.policies: Response status: 400
DEBUG: cli.azure.cli.core.sdk.policies: Response headers:
DEBUG: cli.azure.cli.core.sdk.policies:     'Cache-Control': 'no-cache'
DEBUG: cli.azure.cli.core.sdk.policies:     'Pragma': 'no-cache'
DEBUG: cli.azure.cli.core.sdk.policies:     'Content-Length': '389'
DEBUG: cli.azure.cli.core.sdk.policies:     'Content-Type': 'application/json'
DEBUG: cli.azure.cli.core.sdk.policies:     'Expires': '-1'
DEBUG: cli.azure.cli.core.sdk.policies:     'x-ms-ratelimit-remaining-subscription-writes': '1199'
DEBUG: cli.azure.cli.core.sdk.policies:     'x-ms-correlation-request-id': 'bbfae8bb-fb8e-4094-b50a-50f6bda75605'
DEBUG: cli.azure.cli.core.sdk.policies:     'x-ms-request-id': '3e95f88a-67b5-4ccb-b4b0-fc5e6b5cb859'
DEBUG: cli.azure.cli.core.sdk.policies:     'Strict-Transport-Security': 'max-age=31536000; includeSubDomains'
DEBUG: cli.azure.cli.core.sdk.policies:     'x-ms-routing-request-id': 'FRANCESOUTH:20231106T144628Z:bbfae8bb-fb8e-4094-b50a-50f6bda75605'
DEBUG: cli.azure.cli.core.sdk.policies:     'X-Content-Type-Options': 'nosniff'
DEBUG: cli.azure.cli.core.sdk.policies:     'Date': 'Mon, 06 Nov 2023 14:46:27 GMT'
DEBUG: cli.azure.cli.core.sdk.policies: Response content:
DEBUG: cli.azure.cli.core.sdk.policies: {
  "code": "KubernetesPerformanceError",
  "details": null,
  "message": "Failed to run command due to cluster perf issue, container command-3e95f88a67b54ccbb4b0fc5e6b5cb859 in aks-command namespace did not start within 30s on your cluster, retry may helps. If issue persist, you may need to tune your cluster with better performance (larger node/paid tier).",
  "subcode": "PerfError"
 }
DEBUG: cli.azure.cli.core.azclierror: Traceback (most recent call last):
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/knack/cli.py", line 233, in invoke
    cmd_result = self.invocation.execute(args)
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/cli/core/commands/__init__.py", line 663, in execute
    raise ex
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/cli/core/commands/__init__.py", line 726, in _run_jobs_serially
    results.append(self._run_job(expanded_arg, cmd_copy))
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/cli/core/commands/__init__.py", line 697, in _run_job
    result = cmd_copy(params)
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/cli/core/commands/__init__.py", line 333, in __call__
    return self.handler(*args, **kwargs)
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/cli/core/commands/command_operation.py", line 121, in handler
    return op(**command_args)
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/cli/command_modules/acs/custom.py", line 1911, in aks_runcommand
    command_result_poller = sdk_no_wait(
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/cli/core/util.py", line 716, in sdk_no_wait
    return func(*args, **kwargs)
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/mgmt/containerservice/v2023_07_01/operations/_managed_clusters_operations.py", line 3534, in begin_run_command
    raw_result = self._run_command_initial(
  File "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/mgmt/containerservice/v2023_07_01/operations/_managed_clusters_operations.py", line 3384, in _run_command_initial
    raise HttpResponseError(response=response, error_format=ARMErrorFormat)
azure.core.exceptions.HttpResponseError: (KubernetesPerformanceError) Failed to run command due to cluster perf issue, container command-3e95f88a67b54ccbb4b0fc5e6b5cb859 in aks-command namespace did not start within 30s on your cluster, retry may helps. If issue persist, you may need to tune your cluster with better performance (larger node/paid tier).
Code: KubernetesPerformanceError
Message: Failed to run command due to cluster perf issue, container command-3e95f88a67b54ccbb4b0fc5e6b5cb859 in aks-command namespace did not start within 30s on your cluster, retry may helps. If issue persist, you may need to tune your cluster with better performance (larger node/paid tier).

ERROR: cli.azure.cli.core.azclierror: (KubernetesPerformanceError) Failed to run command due to cluster perf issue, container command-3e95f88a67b54ccbb4b0fc5e6b5cb859 in aks-command namespace did not start within 30s on your cluster, retry may helps. If issue persist, you may need to tune your cluster with better performance (larger node/paid tier).
Code: KubernetesPerformanceError
Message: Failed to run command due to cluster perf issue, container command-3e95f88a67b54ccbb4b0fc5e6b5cb859 in aks-command namespace did not start within 30s on your cluster, retry may helps. If issue persist, you may need to tune your cluster with better performance (larger node/paid tier).
ERROR: az_command_data_logger: (KubernetesPerformanceError) Failed to run command due to cluster perf issue, container command-3e95f88a67b54ccbb4b0fc5e6b5cb859 in aks-command namespace did not start within 30s on your cluster, retry may helps. If issue persist, you may need to tune your cluster with better performance (larger node/paid tier).
Code: KubernetesPerformanceError
Message: Failed to run command due to cluster perf issue, container command-3e95f88a67b54ccbb4b0fc5e6b5cb859 in aks-command namespace did not start within 30s on your cluster, retry may helps. If issue persist, you may need to tune your cluster with better performance (larger node/paid tier).
DEBUG: cli.knack.cli: Event: Cli.PostExecute [<function AzCliLogging.deinit_cmd_metadata_logging at 0x10659f010>]
INFO: az_command_data_logger: exit code: 1
INFO: cli.__main__: Command ran in 31.964 seconds (init: 0.077, invoke: 31.887)
INFO: telemetry.main: Begin splitting cli events and extra events, total events: 1
INFO: telemetry.client: Accumulated 0 events. Flush the clients.
INFO: telemetry.main: Finish splitting cli events and extra events, cli events: 1
INFO: telemetry.save: Save telemetry record of length 4180 in cache
INFO: telemetry.main: Begin creating telemetry upload process.
INFO: telemetry.process: Creating upload process: "/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/bin/python /opt/homebrew/Cellar/azure-cli/2.53.0/libexec/lib/python3.10/site-packages/azure/cli/telemetry/__init__.py /Users/<suppressed>/.azure"
INFO: telemetry.process: Return from creating process
INFO: telemetry.main: Finish creating telemetry upload process.

Expected behavior

$ az aks command invoke -g $rgname -n $aksname --command "kubectl get ns"
command started at 2023-11-06 14:18:58+00:00, finished at 2023-11-06 14:18:59+00:00 with exitcode=0
NAME              STATUS   AGE
aks-command       Active   3s
default           Active   7m23s
kube-node-lease   Active   7m26s
kube-public       Active   7m26s
kube-system       Active   7m26s
$

Environment Summary

azure-cli                         2.53.0 *

core                              2.53.0 *
telemetry                          1.1.0

Extensions:
arcappliance                      0.2.33
azure-firewall                    0.14.8
azurestackhci                      0.2.9
bastion                            0.2.5
customlocation                     0.1.3
init                               0.1.0
k8s-extension                      1.5.0
ssh                                1.1.6

Dependencies:
msal                            1.24.0b2
azure-mgmt-resource             23.1.0b2

Python location '/opt/homebrew/Cellar/azure-cli/2.53.0/libexec/bin/python'
Extensions directory '/Users/z004ctza/.azure/cliextensions'

Python (Darwin) 3.10.13 (main, Aug 24 2023, 22:36:46) [Clang 14.0.3 (clang-1403.0.22.14.1)]

Legal docs and information: aka.ms/AzureCliLegal

Additional context

Cluster created with:

az aks create \
    -l $location \
    -g $rgname \
    -n $aksname \
    -c 1 \
    --enable-addons monitoring \
    --enable-msi-auth-for-monitoring true \
    -u rofz \
    --ssh-key-value ~/.ssh/mykey.pub \
    --disable-public-fqdn \
    --enable-aad \
    --enable-azure-rbac \
    --enable-defender \
    --enable-encryption-at-host \
    --enable-private-cluster \
    --enable-syslog \
    --network-plugin azure \
    --network-plugin-mode overlay \
    --network-policy azure \
    --node-os-upgrade-channel NodeImage \
    --node-osdisk-size 32 \
    --node-osdisk-type Ephemeral \
    --node-resource-group $aksnodesrgname \
    -s Standard_D2d_v5 \
    --os-sku Ubuntu \
    --vnet-subnet-id $akssubnetid \
    --workspace-resource-id $lawid \
    --pod-cidr 172.17.0.0/16 \
    --service-cidr 192.168.8.0/24 \
    --dns-service-ip 192.168.8.10

A cluster created without the --enable-aad and --enable-azure-rbac flags works fine.
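
To compare a failing cluster with a working one, the relevant settings can be pulled out side by side. This is a sketch, not a diagnosis: the `show_cluster_flags` function name is hypothetical, and the key names in the query assume the CLI's camelCase output (for example `enableAzureRbac`), which may differ in casing from the raw REST response in the debug log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show_cluster_flags RESOURCE_GROUP CLUSTER_NAME
# Prints the AAD, Azure RBAC, private-cluster, and SKU tier settings of a cluster.
# Assumption: keys follow the CLI's output casing, not the REST payload's.
show_cluster_flags() {
  az aks show -g "$1" -n "$2" \
    --query '{aadManaged: aadProfile.managed, azureRbac: aadProfile.enableAzureRbac, privateCluster: apiServerAccessProfile.enablePrivateCluster, skuTier: sku.tier}' \
    -o json
}
```

Running `show_cluster_flags "$rgname" "$aksname"` against both clusters should make it easier to see which of these settings actually differ.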

This reproduces the problem reported in issues #22738 and #23203.

The cluster's subnet has a route table with a single route: address prefix 0.0.0.0/0 with next hop type Internet.

azure-client-tools-bot-prd[bot] commented 1 year ago

Hi @RoFz,

2.53.0 is not the latest Azure CLI version (2.53.1).

If you haven't already attempted to do so, please upgrade to the latest Azure CLI version by following https://learn.microsoft.com/en-us/cli/azure/update-azure-cli.

yonzhan commented 1 year ago

Thank you for opening this issue. We will look into it.

RoFz commented 1 year ago

The issue is the same with the latest Azure CLI (2.53.1).

bramdehart commented 7 months ago

Any updates on this? We are facing the same problem.

Oliezhensev commented 7 months ago

I have the same issue. Any updates?