Closed: Permander closed this issue 3 years ago.
@polichtm thank you for raising this, and for recommending the fix in your commit.
At first, I was not able to reproduce this on multiple Azure Subscriptions in our MSFT tenant, but when I tested on an Azure Subscription in a personal tenant, it was reproducible.
This interesting behavior led me to dig further into the root cause.
The reason this is not reproducible in the MSFT tenant, but is reproducible outside it, is that this line of code passes in a --custom-locations-oid that is specific to the MSFT tenant.
My understanding is that this ObjectID was hardcoded to allow the Automation SP to onboard the Arc Cluster end-to-end in Direct Connected mode without requiring human intervention.
In reality, customers can use the steps here to grab the unique ObjectID for the Custom Locations RP in their AAD tenant. In the case of Jumpstart, however, we can't query the AAD tenant with the Client SP because it doesn't have AAD-level permissions (only Subscription-level Contributor).
Given this, my understanding is that there are 3 routes to solve it:
1. Shortcut route: pass the --kubeconfig path when creating the Custom Location, which does allow the Custom Location to get created correctly by bypassing the incorrect oid. I think this route should work for all Azure K8s scenarios since the kubeconfig is available in the Client VM, but I'm not sure if this will work on EKS and GKE.
2. "Proper" route: As a pre-requisite to the Jumpstart ARM templates, have users look up the ObjectID in their own AAD Tenant and pass this correct ObjectID in as part of the ARM template. This allows us to pass in the correct ObjectID in this line, which allows us to onboard the K8s cluster via the Automation SP without the user having to intervene manually post-deployment.
3. "Proper" route 2: We could ask users to elevate the Automation SP to have AAD querying permissions as well (so it can grab the ObjectID by itself inside the ClientVM). This is more invasive than 2 since it exposes unnecessary permissions to the SP.
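For reference, here is a rough sketch of what the first "Proper" route could look like, written in bash rather than the PowerShell used by the logon script. The exact commands are not preserved in this thread, so this follows the public Azure Arc custom locations documentation as I understand it. The application ID for the Custom Locations RP and the id query field are assumptions to verify against the current docs (older CLI versions exposed the field as objectId), and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the lookup assumes the Azure CLI is logged in as a user who can
# read service principals in the tenant (the Automation SP cannot, per the thread).

# Assumption: application ID of the Custom Locations RP as given in the public
# Azure Arc docs; verify it before relying on it.
CUSTOM_LOCATIONS_APP_ID="bc313c14-388c-4e7d-a58e-70017303ee3b"

# Run once by the user, before deploying the ARM template.
# Prints the tenant-specific ObjectID of the Custom Locations RP.
get_custom_locations_oid() {
  # Assumption: newer CLI versions expose the field as "id"; older ones used "objectId".
  az ad sp show --id "$CUSTOM_LOCATIONS_APP_ID" --query id -o tsv
}

# Run by the automation, with the ObjectID passed in as a template parameter.
# Arguments: cluster name, resource group, ObjectID.
connect_cluster_with_oid() {
  local cluster_name="$1" resource_group="$2" oid="$3"
  if [ -z "$oid" ]; then
    echo "error: Custom Locations ObjectID is empty" >&2
    return 1
  fi
  az connectedk8s connect \
    --name "$cluster_name" \
    --resource-group "$resource_group" \
    --custom-locations-oid "$oid"
}
```

The user would run get_custom_locations_oid once with their own credentials and supply the result as a template parameter. The automation would then call connect_cluster_with_oid with that value, so the SP never needs AAD-level permissions.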
@polichtm - since you've figured out 1. Shortcut route already, please continue using it to unblock your exploration. @likamrat @dkirby-ms - let's discuss 2 & 3 since they would be a fairly major change across all of our scenarios (including ArcBox etc.).
@mdrakiburrahman I have just tested the "shortcut" method on ArcBox with success. The specific changes are in the arcbox_customlocationfix branch and follow your pattern of dropping the --custom-locations-oid parameter from the k8s cluster onboarding, and passing the local kubeconfig file during custom location create with the --kubeconfig parameter.
I am going to merge the ArcBox fix. I think the same fix will work in the other data services scenarios, but I have not tested them.
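For anyone applying the same pattern to another scenario, here is a minimal sketch of the "shortcut" method as described above, in bash rather than the PowerShell of the logon script. The function names and arguments are hypothetical. The flags are the ones quoted in this thread, plus the standard --name and --resource-group on az connectedk8s connect.

```shell
#!/usr/bin/env bash
# Sketch of the "shortcut" pattern; function names and arguments are placeholders.

# Onboard the cluster WITHOUT --custom-locations-oid.
# Arguments: cluster name, resource group.
connect_cluster() {
  local cluster_name="$1" resource_group="$2"
  az connectedk8s connect \
    --name "$cluster_name" \
    --resource-group "$resource_group"
}

# Create the custom location, passing the local kubeconfig file.
# Arguments: resource group, connected cluster ID, extension ID, kubeconfig path.
create_custom_location() {
  local resource_group="$1" cluster_id="$2" extension_id="$3" kubeconfig="$4"
  if [ ! -f "$kubeconfig" ]; then
    echo "error: kubeconfig not found: $kubeconfig" >&2
    return 1
  fi
  az customlocation create \
    --name 'jumpstart-cl' \
    --resource-group "$resource_group" \
    --namespace arc \
    --host-resource-id "$cluster_id" \
    --cluster-extension-ids "$extension_id" \
    --kubeconfig "$kubeconfig"
}
```

Checking for the kubeconfig file up front makes a missing file fail fast with a clear message, rather than surfacing later as a permissions error from the resource provider.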
Scenario which you are working on
https://azurearcjumpstart.io/azure_arc_jumpstart/azure_arc_data/aks/aks_postgresql_hyperscale_arm_template/
Describe the bug
The first phase of the automation completed without any errors, but after I connected to the client VM over RDP, the DataServicesLogonScript PowerShell logon script started executing and failed while creating the custom location. The script contains:
az customlocation create --name 'jumpstart-cl' --resource-group $env:resourceGroup --namespace arc --host-resource-id $connectedClusterId --cluster-extension-ids $extensionId
The error I received is:
"Deployment failed. Correlation ID: b96f3532-2453-46be-a6f8-3c8de785cc0e. "Microsoft.ExtendedLocation" resource provider does not have the required permissions to create a namespace on the cluster. Refer to https://aka.ms/ArcK8sCustomLocationsDocsEnableFeature to provide the required permissions to the resource provider."
After investigating further, I found that the --kubeconfig parameter must be passed for a non-AAD-enabled cluster. I then checked the AKS cluster that was created during the first phase of the automation, and it was a non-AAD-enabled cluster. After I passed the admin kubeconfig of the cluster, the issue was resolved:
az customlocation create --name 'jumpstart-cl' --resource-group $env:resourceGroup --namespace arc --host-resource-id $connectedClusterId --cluster-extension-ids $extensionId --kubeconfig