Closed AAkindele closed 3 years ago
Suggestions:
Set the region1 variable in the following file to another acceptable region, e.g., eastus:
..\enterprise_scale\construction_sets\aks\online\aks_secure_baseline\configuration\global_settings.tfvars
Set the name variables in the following file to a unique string, e.g., demo or your alias/username:
..\enterprise_scale\construction_sets\aks\online\aks_secure_baseline\configuration\global_settings.tfvars
Observations:
Troubleshooting cluster admin not set issue
If you run the script from the CAF (Cloud Adoption Framework) repo (https://github.com/Azure/caf-terraform-landingzones-starter/tree/starter/enterprise_scale/construction_sets/aks/online/aks_secure_baseline) as is, the code/file that creates an AAD user group and adds it as cluster admin is commented out by default, and you will run into an issue at step 6 of this doc (https://github.com/Azure/caf-terraform-landingzones-starter/blob/starter/enterprise_scale/construction_sets/aks/online/aks_secure_baseline/02-aks.md). When you run kubectl get pods -n a0008, you will see that neither of the two traefik ingress controllers is running (i.e., both have status 0/1). When you open the application URL in the browser, you will see a "502 Bad Gateway" error message.
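The 0/1 state described above can also be picked out by filtering the READY column instead of reading the table by eye. This is a minimal sketch, assuming the default column layout of kubectl get pods (NAME, READY, STATUS, ...); the function name not_ready_pods is hypothetical and not part of the CAF scripts.

```shell
# Print the names of pods whose READY column (e.g. 0/1) shows fewer
# ready containers than expected. Reads `kubectl get pods` output on stdin.
not_ready_pods() {
  # NR > 1 skips the header row; $2 is the READY column ("ready/total").
  awk 'NR > 1 { split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Usage (requires access to the cluster):
#   kubectl get pods -n a0008 | not_ready_pods
```

An empty result means every listed pod reports all of its containers as ready.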
If you try to connect to the workload directly by running kubectl run curl -n a0008 -i --tty --rm --image=mcr.microsoft.com/azure-cli --limits='cpu=200m,memory=128Mi', you will get the following error message:
Error from server (Forbidden): pods is forbidden: User "abc@xyz" cannot create resource "pods" in API group "" in the namespace "a0008"
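The Forbidden error can be confirmed ahead of time with kubectl auth can-i, which asks the API server whether the current user may perform an action. A minimal sketch; the wrapper name can_create_pods is hypothetical, and it assumes kubectl is already pointed at the cluster.

```shell
# Return success if the current user may create pods in the given namespace.
# `kubectl auth can-i` prints "yes" or "no".
can_create_pods() {
  ns="$1"
  [ "$(kubectl auth can-i create pods --namespace "$ns" 2>/dev/null)" = "yes" ]
}

# Usage (requires access to the cluster):
#   can_create_pods a0008 || echo "not allowed to create pods in a0008"
```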
A temporary workaround:
Now if you re-run kubectl run curl -n a0008 -i --tty --rm --image=mcr.microsoft.com/azure-cli --limits='cpu=200m,memory=128Mi' to connect to the workload directly, you won't see any error message. In the open shell, type curl -kI https://bu0001a0008-00.aks-ingress.contoso.com -w '%{remote_ip}\n' and you will see an IP address in the output. It should match the record value in the Private DNS Zone created by the script. To verify, open the Private DNS Zone on the Azure portal; the IP address is listed in the "Value" column.
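The portal check can also be done from the command line with az network private-dns record-set a list. The sketch below is hedged: the resource group is not stated in this issue, so it is passed as a placeholder argument, and the zone name aks-ingress.contoso.com is only inferred from the URL above. Both function names are hypothetical.

```shell
# List the IPv4 addresses of all A records in a Private DNS Zone.
# $1 = resource group (placeholder, not given in this issue), $2 = zone name.
private_dns_a_records() {
  az network private-dns record-set a list \
    --resource-group "$1" \
    --zone-name "$2" \
    --query "[].aRecords[].ipv4Address" \
    --output tsv
}

# Return success if IP $3 is among those records (exact line match).
ip_in_private_dns() {
  private_dns_a_records "$1" "$2" | grep -qxF "$3"
}

# Usage (requires `az login`; <resource-group> and <ip-from-curl> are placeholders):
#   ip_in_private_dns <resource-group> aks-ingress.contoso.com <ip-from-curl>
```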
Solution 1:
Replace the ignore extension with tfvars for the following file:
..\enterprise_scale\construction_sets\aks\online\aks_secure_baseline\configuration\iam\iam_aad.ignore
..\enterprise_scale\construction_sets\aks\online\aks_secure_baseline\configuration\aks.tfvars
admin_group_object_ids = ["7304e4e7-b148-4ada-a135-6049c702d21e"]
azuread_groups = {
  keys = ["aks_cluster_re1_admins"]
}
If you try to hit the URL, the same "502 Bad Gateway" issue will still be there. The last step is to add yourself as an owner of the newly created AAD group on the Azure portal. You will already be a member once the script has finished. If you run kubectl get pods -n a0008 again, you will see both traefik ingress controllers up and running.
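The owner assignment can also be scripted with the Azure CLI rather than done in the portal. This is a sketch under assumptions: the function name add_self_as_group_owner is hypothetical, the group's actual display name or object ID must be supplied by you (it is not given in this issue), and the id property reflects newer Azure CLI versions (older ones exposed objectId instead).

```shell
# Add the signed-in user as an owner of an AAD group.
# $1 = group display name or object ID (placeholder; look it up in your tenant).
add_self_as_group_owner() {
  # Newer Azure CLI versions return the object ID as `id`; older ones used `objectId`.
  owner_id="$(az ad signed-in-user show --query id --output tsv)" || return 1
  az ad group owner add --group "$1" --owner-object-id "$owner_id"
}

# Usage (requires `az login` and permission to manage the group):
#   add_self_as_group_owner <group-name-or-object-id>
```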
Further improvement: update the iam_aad.tfvars file to automatically add yourself as the AAD group owner. Potentially parameterize your user info in a separate tfvars file.
Solution 2:
Other notes:
When running kubectl get pods -n a0008, make sure both traefik ingress controllers are running. Expected output:
NAME                                          READY   STATUS    RESTARTS   AGE
aspnetapp-deployment-7ccf7cb7f9-6ltsd         1/1     Running   0          61s
aspnetapp-deployment-7ccf7cb7f9-wh2lp         1/1     Running   0          61s
traefik-ingress-controller-844fcdd859-k7dgj   0/1     Running   0          58s
traefik-ingress-controller-844fcdd859-p6g8w   0/1     Running   0          58s
Expected output on the browser