Closed JarkoDubbeldam closed 1 year ago
Hi @JarkoDubbeldam, are you using a service principal for your AKS cluster to access other Azure Active Directory (Azure AD) resources? We currently do not support service principals with AKS; for more detail, see the limitations of the azureml-extension.
I created the cluster with the --enable-managed-identity flag in the Azure CLI, so from what I can tell that should be in the clear.
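For anyone who wants to double-check which identity a cluster actually uses, here is a minimal sketch. It assumes the Azure CLI is installed and logged in, and that the JSON returned by `az aks show` carries a top-level `identity.type` field (absent or null for service-principal clusters). The function names are hypothetical helpers, not part of any SDK.

```python
import json
import subprocess
from typing import Any, Dict, List


def aks_show_cmd(resource_group: str, cluster_name: str) -> List[str]:
    """Build the `az aks show` command that returns the cluster as JSON."""
    return [
        "az", "aks", "show",
        "--resource-group", resource_group,
        "--name", cluster_name,
        "--output", "json",
    ]


def uses_managed_identity(cluster: Dict[str, Any]) -> bool:
    """Return True if the cluster JSON reports a managed identity.

    Assumption: `identity.type` is "SystemAssigned" or "UserAssigned" for
    managed-identity clusters and missing/null otherwise.
    """
    identity = cluster.get("identity") or {}
    return identity.get("type") in ("SystemAssigned", "UserAssigned")


def fetch_cluster(resource_group: str, cluster_name: str) -> Dict[str, Any]:
    """Run the Azure CLI and parse its JSON output (requires `az` on PATH)."""
    result = subprocess.run(
        aks_show_cmd(resource_group, cluster_name),
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling `uses_managed_identity(fetch_cluster("aks-poc", "myAKSCluster"))` would then answer the question above for the cluster named later in this thread. On Windows the `az` executable may need to be invoked as `az.cmd`.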
Hi @jiaochenlu, I have the same issue. I created the AKS cluster with the Python SDK using the code below, then installed the extension with the Azure CLI.
from azure.core.polling import LROPoller
from azure.mgmt.containerservice import ContainerServiceClient


def create_cluster(
    subscription_id: str,
    resource_group: str,
    cluster_name: str,
    location: str,
    app_id: str,
    app_secret: str,
):
    # _manage_client_factory is a helper defined elsewhere (not shown here).
    # app_id and app_secret are not used in the code shown; the cluster is
    # created with a system-assigned managed identity.
    client: ContainerServiceClient = _manage_client_factory(subscription_id)
    mc_models = client.managed_clusters.models
    poller: LROPoller = client.managed_clusters.begin_create_or_update(
        resource_group,
        cluster_name,
        parameters=mc_models.ManagedCluster(
            identity=mc_models.ManagedClusterIdentity(
                type=mc_models.ResourceIdentityType.system_assigned
            ),
            location=location,
            dns_prefix=cluster_name,
            agent_pool_profiles=[
                # System node pool
                mc_models.ManagedClusterAgentPoolProfile(
                    name="default1",
                    count=1,
                    vm_size=mc_models.ContainerServiceVMSizeTypes.STANDARD_B2_S,
                    mode=mc_models.AgentPoolMode.SYSTEM,
                    scale_set_priority=mc_models.ScaleSetPriority.REGULAR,
                ),
                # GPU user node pool on spot instances, initially empty
                mc_models.ManagedClusterAgentPoolProfile(
                    name="gpuproc1",
                    count=0,
                    vm_size='Standard_NC4as_T4_v3',
                    mode=mc_models.AgentPoolMode.USER,
                    scale_set_priority=mc_models.ScaleSetPriority.SPOT,
                ),
            ],
            # addon_profiles=[
            #     mc_models.ManagedClusterAddonProfile(
            #         enabled=True,
            #         config=
            #     ),
            # ],
        ),
    )
    # Block until the long-running create/update operation finishes.
    managed_cluster = poller.result()
$ az k8s-extension create --cluster-name [cluster-name] --cluster-type managedClusters --resource-group [resource-group-name] --scope cluster --extension-type Microsoft.AzureML.Kubernetes --name azure-ml --config enableTraining=False enableInference=True inferenceRouterServiceType=LoadBalancer allowInsecureConnections=True inferenceLoadBalancerHA=False --cluster-type managedClusters
Troubleshooting: https://aka.ms/arcmltsg
SSL is not enabled. Allowing insecure connections to the deployed services.
'Extensions' cannot be used because 'Microsoft.KubernetesConfiguration' provider has not been registered.More details for registering this provider can be found here - https://aka.ms/RegisterKubernetesConfigurationProvider
(ExtensionOperationFailed) The extension operation failed with the following error: Request failed to https://management.azure.com/subscriptions/[subscription_id]/resourceGroups/smarteye-ml/providers/Microsoft.ContainerService/managedclusters/[cluster-name]/extensionaddons/azure-ml?api-version=2021-03-01. Error code: Unauthorized. Reason: Unauthorized.{"error":{"code":"InvalidAuthenticationToken","message":"The received access token is not valid: at least one of the claims 'puid' or 'altsecid' or 'oid' should be present. If you are accessing as application please make sure service principal is properly created in the tenant."}}.
Code: ExtensionOperationFailed
Message: The extension operation failed with the following error: Request failed to https://management.azure.com/subscriptions/[subscription_id]/resourceGroups/smarteye-ml/providers/Microsoft.ContainerService/managedclusters/ne-aks-mlflow-sandbox/extensionaddons/azure-ml?api-version=2021-03-01. Error code: Unauthorized. Reason: Unauthorized.{"error":{"code":"InvalidAuthenticationToken","message":"The received access token is not valid: at least one of the claims 'puid' or 'altsecid' or 'oid' should be present. If you are accessing as application please make sure service principal is properly created in the tenant."}}.
@JarkoDubbeldam @alipek It seems that this issue occurred before the k8s-extension was installed. We have contacted the relevant team to investigate the cause of the error, and I will follow up here.
This happens if the Microsoft.KubernetesConfiguration resource provider is not registered for the subscription. Please register this resource provider in your subscription and confirm that the registration status changes to 'Registered'.
Then create a new AKS cluster and install the extension.
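The registration step described above could be scripted roughly as follows. This is only a sketch: it assumes the Azure CLI is installed and logged in to the right subscription, and that `az provider show` returns JSON with a `registrationState` field. The helper names and the injectable `run` parameter are illustrative choices, not an official API.

```python
import json
import subprocess
from typing import Callable, List

NAMESPACE = "Microsoft.KubernetesConfiguration"


def show_cmd(namespace: str = NAMESPACE) -> List[str]:
    """Command that reports the provider's registration state as JSON."""
    return ["az", "provider", "show", "--namespace", namespace, "--output", "json"]


def register_cmd(namespace: str = NAMESPACE) -> List[str]:
    """Command that starts registration of the provider."""
    return ["az", "provider", "register", "--namespace", namespace]


def run_az(cmd: List[str]) -> str:
    """Run an Azure CLI command and return its stdout (requires `az` on PATH)."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def ensure_registered(
    namespace: str = NAMESPACE,
    run: Callable[[List[str]], str] = run_az,
) -> str:
    """Register the provider if needed; return the state seen before acting."""
    state = json.loads(run(show_cmd(namespace))).get("registrationState", "Unknown")
    if state != "Registered":
        # Registration is asynchronous; re-run the show command afterwards
        # until it reports 'Registered' before creating the cluster.
        run(register_cmd(namespace))
    return state
```

Calling `ensure_registered()` once and then re-checking with `az provider show -n Microsoft.KubernetesConfiguration -o table` should confirm the status before the cluster is recreated.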
Thanks for the solution provided by @NarayanThiru
@JarkoDubbeldam @alipek Hi, have you been able to create the k8s-extension successfully now? Can we consider this issue mitigated?
Thanks @NarayanThiru, this is working now that I have registered the resource provider.
That provider was already registered in my subscription. I did recreate the entire cluster just now, and for some reason it does work now. Not sure what changed exactly. I guess this can stay closed. Below is my complete script for completeness' sake:
az provider show -n Microsoft.KubernetesConfiguration -o table
az group create -n aks-poc --location westeurope
az aks create -g aks-poc -n myAKSCluster --enable-managed-identity `
    --node-count 1 --enable-addons monitoring --enable-msi-auth-for-monitoring `
    --generate-ssh-keys
az k8s-extension create --name azureml --extension-type Microsoft.AzureML.Kubernetes `
    --config enableTraining=True enableInference=True inferenceRouterServiceType=LoadBalancer `
    allowInsecureConnections=True inferenceLoadBalancerHA=False --cluster-type managedClusters `
    --cluster-name myAKSCluster --resource-group aks-poc --scope cluster
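To confirm afterwards that the extension really installed, a check along these lines might be useful. It is a sketch under assumptions: the Azure CLI with the k8s-extension add-on is available, and the JSON from `az k8s-extension show` exposes a `provisioningState` field. The helper names and the injectable `run` parameter are hypothetical.

```python
import json
import subprocess
from typing import Callable, List


def extension_show_cmd(
    cluster_name: str,
    resource_group: str,
    name: str = "azureml",
    cluster_type: str = "managedClusters",
) -> List[str]:
    """Build the `az k8s-extension show` command for one extension instance."""
    return [
        "az", "k8s-extension", "show",
        "--name", name,
        "--cluster-name", cluster_name,
        "--cluster-type", cluster_type,
        "--resource-group", resource_group,
        "--output", "json",
    ]


def run_az(cmd: List[str]) -> str:
    """Run an Azure CLI command and return its stdout (requires `az` on PATH)."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def extension_state(
    cluster_name: str,
    resource_group: str,
    name: str = "azureml",
    run: Callable[[List[str]], str] = run_az,
) -> str:
    """Return the extension's provisioning state, or 'Unknown' if not reported."""
    payload = json.loads(run(extension_show_cmd(cluster_name, resource_group, name)))
    # Assumption: the state is reported under the key `provisioningState`.
    return payload.get("provisioningState", "Unknown")
```

With the names from the script above, `extension_state("myAKSCluster", "aks-poc")` would report the state of the `azureml` extension.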
I am trying to connect a brand new AKS cluster (https://learn.microsoft.com/en-us/azure/aks/learn/quick-kubernetes-deploy-cli) to Azure ML. However, the step where I have to install the ML extension into AKS fails with an error I can't find anywhere in the troubleshooting guides. If this is the wrong repository for the error, I'm sorry.
It's unclear to me which authorization is the issue here.
Version info: