In this repo, you will learn about some of the AKS features that make it easier for developers to deploy their code without having to worry too much about infrastructure. Begin by cloning the repository.
:warning: You will need to fork this repo and clone your fork in order to complete the steps that require updating the GitHub (GH) repository and using GH Actions for deployment. I recommend you do this and clone your fork instead of the repo used in the command below.
git clone --recurse-submodules https://github.com/mosabami/SuperAKS-App
cd SuperAKS-App
In this workshop, you are a developer who created the POC of a promotional app for your organization that showcases one of its ML APIs. The app includes an ML model that identifies the location of faces in pictures provided to it via URL. You have written the code and would love to deploy it online. You have chosen to deploy it to a k8s cluster, and you have heard AKS is the best place to run Kubernetes, so you decide to try it out yourself.
AKS has a lot of amazing features that make software development and delivery very easy. It takes care of a lot of the infrastructure- and security-related heavy lifting for you. In this workshop we will discuss the following features of AKS:
If there are other AKS features you'd like to see here that help with developer productivity, please create an issue. PRs are also welcome!
Check out the Prerequisites page for information on this.
Introduce a bug in the code so that we can correct it later
:warning: If you are using a Mac, you will need to change the command to:
sed -i '' "s/bug = false/bug = true/" smartbrain/smartbrain/smartbrainapi/server.js
:bulb: If these sed commands don't work for any reason, or if you don't have sed installed, you will need to update these files manually by replacing the placeholders in the files mentioned below.
sed -i "s/bug = false/bug = true/" smartbrain/smartbrain/smartbrainapi/server.js
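If you'd like a quick sanity check of the substitution pattern before touching the real file, you can try it on a scratch file (purely illustrative; the file contents here are made up):

```shell
# Try the same substitution on a throwaway file first (optional sanity check).
tmpfile=$(mktemp)
echo 'const bug = false;' > "$tmpfile"
# GNU sed syntax; on macOS use: sed -i '' "s/bug = false/bug = true/" "$tmpfile"
sed -i "s/bug = false/bug = true/" "$tmpfile"
cat "$tmpfile"   # prints: const bug = true;
rm -f "$tmpfile"
```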
If you have Docker Desktop installed and running on your computer, and you have some experience with docker-compose, you can run the application on your local computer.
Run the application using docker-compose.
cd smartbrain/smartbrain
docker-compose up
You can access the website at port 3050 on your local computer, with NGINX acting as the ingress. Check out the docker-compose.yaml file for more details. You can test the app by clicking on Register and providing the required information.
You will notice that even though the app is able to detect the face, there is an error in the detection count. This is the error we intentionally added earlier; we will fix it at a later step. Here is what the architecture of the app looks like:
CD out of the app's folder.
cd ../..
Now that we have seen the app running locally, it is time to deploy it to AKS. This deployment uses preview features, including workload identity and CNI Overlay. You will need to ensure these features are enabled in your subscription before proceeding with the deployment.
AKS Construction (AKSC) is part of the AKS landing zone accelerator program and allows rapid development and deployment of secure AKS clusters and their supporting resources using IaC (mostly Bicep), the Azure CLI, and/or GitHub Actions. The AKS Landing Zone Accelerator is a set of tools, resources, and guidance that helps you rapidly deploy and operationalize secure, scalable AKS clusters and supporting services; the AKS Construction helper is one of those tools. Check out the official docs for more information.
:warning: It is very important to note that the AKS-LZA can be used to develop secure and compliant AKS clusters that are (almost) ready for production. However, much of the best-practice guidance is not applied in this implementation, to keep the workshop easy to learn and deploy. Do not use this configuration for production workloads. To deploy a more secure environment, consider reading the AKS-LZA docs and/or deploying your environment using a configuration similar to this.
If you used the --recurse-submodules flag when cloning this repo, you won't need to clone the AKS-Construction repository (which contains the Bicep files our deployment depends on) into the IaC folder.
cd IaC
# git clone https://github.com/Azure/AKS-Construction
Get the signed-in user ID so that you can get admin access to the cluster you create:
SIGNEDINUSER=$(az ad signed-in-user show --query id --out tsv)
RGNAME=superapp
Create deployment
az group create -n $RGNAME -l EastUs
DEP=$(az deployment group create -g $RGNAME --parameters signedinuser=$SIGNEDINUSER -f main.bicep -o json)
:bulb: The DEP variable is very important and will be used in subsequent steps. You can save it by running
echo $DEP > test.json
and restore it by running `export DEP=$(cat test.json)`
Get required variables
KVNAME=$(echo $DEP | jq -r '.properties.outputs.kvAppName.value')
OIDCISSUERURL=$(echo $DEP | jq -r '.properties.outputs.aksOidcIssuerUrl.value')
AKSCLUSTER=$(echo $DEP | jq -r '.properties.outputs.aksClusterName.value')
SUPERAPPID=$(echo $DEP | jq -r '.properties.outputs.idsuperappClientId.value')
TENANTID=$(az account show --query tenantId -o tsv)
ACRNAME=$(az acr list -g $RGNAME --query [0].name -o tsv)
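If you are unfamiliar with jq, here is a self-contained sketch of what these extractions do, run against a mock of the deployment output (the field names mirror the real outputs above; the values are made up):

```shell
# Mock of the az deployment output shape; values are illustrative only.
DEP='{"properties":{"outputs":{"kvAppName":{"value":"kv-demo"},"aksClusterName":{"value":"aks-demo"}}}}'
KVNAME=$(echo "$DEP" | jq -r '.properties.outputs.kvAppName.value')
AKSCLUSTER=$(echo "$DEP" | jq -r '.properties.outputs.aksClusterName.value')
echo "$KVNAME $AKSCLUSTER"   # prints: kv-demo aks-demo
```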
Log into AKS and deploy NGINX ingress.
az aks get-credentials -n $AKSCLUSTER -g $RGNAME --overwrite-existing
kubectl get nodes
Run the command from the official NGINX docs page for Azure to install the NGINX ingress controller, which will be used later.
Create the superapp namespace that will be required for future steps
kubectl create namespace superapp
cd out of IaC folder
cd ..
We will build images from source code and pull database images from Docker Hub. We will store these images in our container registry to stay in compliance with our policy of only using images from approved registries.
Build front end image
cd smartbrain/smartbrain/smartbrainclient
az acr build -t client:v1 -r $ACRNAME --resource-group $RGNAME .
Build api image
cd ../smartbrainapi
az acr build -t server:v1 -r $ACRNAME --resource-group $RGNAME .
Build ML service image
cd ../smartbrainml
az acr build -t worker:v1 -r $ACRNAME --resource-group $RGNAME .
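The three builds above follow the same pattern; if you prefer, a loop like the sketch below generates the equivalent commands (it only echoes them here so you can review them before running; the folder names are the ones used above):

```shell
# Echo the three equivalent az acr build commands for review.
# Assumes you are in the smartbrain/smartbrain folder with ACRNAME and RGNAME set.
for svc in client:smartbrainclient server:smartbrainapi worker:smartbrainml; do
  tag=${svc%%:*}   # image repository name, e.g. client
  dir=${svc#*:}    # source folder, e.g. smartbrainclient
  echo az acr build -t "$tag:v1" -r "$ACRNAME" --resource-group "$RGNAME" "./$dir"
done
```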
Import redis and postgres images from Dockerhub
az acr import --name $ACRNAME --source docker.io/library/redis:latest --resource-group $RGNAME
az acr import --name $ACRNAME --source docker.io/library/postgres:latest --resource-group $RGNAME
Verify that the 5 required images are in the container registry
az acr repository list --name $ACRNAME --resource-group $RGNAME
Draft is a tool that makes it easy to generate the resources required to deploy applications to Kubernetes. This includes the creation of Dockerfiles, Kubernetes manifest files, Helm charts, Kustomize files, GitHub Actions pipelines, etc. In this section, we will showcase the use of Draft to create Kubernetes manifest files, speeding up the creation of resources required to deploy to Kubernetes using the Developer Tools for Azure Kubernetes Service extension. For more information about the features of this extension, check out this video that walks you through it if you are a Microsoft employee, or check out the demo repo otherwise.
:warning: For the Draft and Developer Tools for AKS extension to work properly, you need to ensure your file path is not too long. Make sure you are working off a folder that doesn't have a long file path.
:bulb: At this point, you might want to switch to a different branch so that the changes you are about to make to the files don't carry over to your main branch, allowing you to easily rerun this workshop in the future:
git checkout -b test
The AKS DevX tool will automatically create draft deployment and service manifest files, ready for you to modify to suit your needs, within the manifests folder of the folder you selected as your output folder. These files can then be updated to include required environment variables, resource requests and limits, proper labeling for deployment and service selectors, etc.
For the rest of this workshop however, we will use the manifest files already provided in the k8s folder in the root directory.
Change the deployment files to use the proper container registry name using sed commands.
:warning: If you are using a Mac, you will need to change the command to:
sed -i '' "s/<ACR name>/$ACRNAME/" client-deployment.yaml
:bulb: If these sed commands don't work for any reason, or if you don't have sed installed, you will need to update these files manually by replacing the placeholders in the files mentioned below.
:warning: CD to the root directory before proceeding to the next steps
cd k8s
sed -i "s/<ACR name>/$ACRNAME/" client-deployment.yaml
sed -i "s/<ACR name>/$ACRNAME/" postgres-deployment.yaml
sed -i "s/<ACR name>/$ACRNAME/" redis-deployment.yaml
sed -i "s/<ACR name>/$ACRNAME/" api-deployment.yaml
sed -i "s/<ACR name>/$ACRNAME/" worker-deployment.yaml
Update the secret provider class file
sed -i "s/<identity clientID>/$SUPERAPPID/" secret-provider-class.yaml
sed -i "s/<kv name>/$KVNAME/" secret-provider-class.yaml
sed -i "s/<tenant ID>/$TENANTID/" secret-provider-class.yaml
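For reference, a SecretProviderClass for the Azure provider generally has a shape like the sketch below (the metadata and object names here are illustrative; check the actual secret-provider-class.yaml for the real ones). The three sed commands above fill in the placeholders:

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: superapp-secrets           # illustrative name
  namespace: superapp
spec:
  provider: azure
  parameters:
    clientID: <identity clientID>  # replaced by $SUPERAPPID
    keyvaultName: <kv name>        # replaced by $KVNAME
    tenantId: <tenant ID>          # replaced by $TENANTID
    objects: |
      array:
        - |
          objectName: postgres-password   # illustrative secret name
          objectType: secret
```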
Update the service account files. These service accounts use workload identity federated credentials.
sed -i "s/<identity clientID>/$SUPERAPPID/" svc-accounts.yaml
sed -i "s/<tenant ID>/$TENANTID/" svc-accounts.yaml
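The service accounts themselves are plain Kubernetes ServiceAccount objects carrying the standard workload identity annotations; roughly (the name here is illustrative; see svc-accounts.yaml for the real ones):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: superapp-sa                # illustrative name
  namespace: superapp
  annotations:
    azure.workload.identity/client-id: <identity clientID>  # replaced by $SUPERAPPID
    azure.workload.identity/tenant-id: <tenant ID>          # replaced by $TENANTID
```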
Deploy the resources into the superapp namespace.
kubectl apply -f .
:bulb: Since the API pod depends on the postgres pod being deployed, we included an init container in the API deployment file, which ensures that the postgres pod is ready before the API pod starts. Check out the API pod definition file to see how to use an init container.
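An init container of that kind typically looks something like the sketch below (the image and wait loop are assumptions for illustration; the real definition is in api-deployment.yaml):

```yaml
spec:
  initContainers:
    - name: wait-for-postgres
      image: busybox:1.36          # any small image with nc works
      # Block until postgres-service answers on 5432, then let the api start.
      command: ['sh', '-c', 'until nc -z postgres-service 5432; do echo waiting; sleep 2; done']
  containers:
    - name: api
      # ... main api container definition ...
```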
So what have we done here? We are using workload identities. Workload identity is a soon-to-be-released AKS feature that allows you to use any of various identity providers as the identity of your pod. In this case we are using Azure AD as the identity provider and the AKS cluster as the OIDC issuer. In the future, you will be able to use other identity providers as well. This identity is only assigned to the pods that use the service account attached to the identity, so other pods on the same node won't have the same access. This is important for securing your workloads by granting minimum access. In this case, we are using the identity to get access to Azure Key Vault. Only this identity, and consequently the pods configured to use it, will be able to pull secrets from the vault and get the postgres database password. Check out the postgres and server deployment YAML files as well as the svc-accounts and secret-provider-class YAML files for more details.
But how was the workload identity deployed? Check the resources towards the end of the main.bicep file in the IaC folder as well as the workloadid.bicep file. The kvrbac.bicep file shows how the workload identity was granted access to keyvault to pull secrets as well as how the postgres password was created.
You can now access your application on a web browser (or postman) using the nginx ingress controller you deployed. You will need the ip address of the ingress controller's service.
kubectl get ingress -n superapp
:bulb: You might want to set the context of your kubeconfig to use the superapp namespace by default so you don't have to add
-n superapp
at the end of every command. To do this, run: `kubectl config set-context $(kubectl config current-context) --namespace=superapp`
You should be able to use the IP address as shown in the screenshot below. You can test the app by entering a number under 40, clicking Submit, and refreshing the page.
AKS makes it easy to monitor your applications using various tools including Prometheus, Grafana, and Azure monitor. In this workshop, we will be using container insights.
We tested the app with a single user accessing it through the website. But how do we ensure our application will hold up when hundreds or thousands of users access it at once? We will use Azure Load Testing with a JMeter test script to find out. We will see whether our cluster scales and learn how easy it is to enable scaling. For more general information about this, check out the Scalability scenario on AKS-LZA.
Open the Azure portal in two tabs. In the first one, navigate to Container insights to see the usage of your cluster: AKS resource -> Insights (in the left blade under Monitoring) -> "Containers" tab (found in the top middle of the page) -> Time range (at the top left, not in the blade) -> Last 30 minutes -> Apply.
Here you should be able to see the usage of the pods over the last 30 minutes. Filter to only show pods in the superapp namespace: "Add filter" -> Namespace -> superapp
To help the api pods handle the load, we will deploy a horizontal pod autoscaler for the api deployment. cd to the ./loadtesting folder and deploy the horizontal pod autoscalers. Watch the pods to see whether new ones are created as we run the load test again:
cd ../loadtesting
kubectl apply -f .
kubectl get pods -n superapp -w
Head back to the load test tab and rerun the test by clicking the "Rerun" button on the results page. A minute or so after the test completes, you will find that some new pods have been created, but many of them are stuck in the Pending state:
kubectl get pods -n superapp
Heading over to Azure Monitor will show that the worker pods that could be scheduled are fully utilized, even though the node itself is not, as shown in the Nodes tab. This happens because the total CPU requests of all the scheduled pods have reached the CPU available on the node, even though the requested CPUs are not fully utilized by the pods. To avoid this, set the requests in your deployment manifest files to lower numbers. For the sake of this demo, however, we will leave them as they are.
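The key point is that the scheduler reserves the full requests amount per pod regardless of actual usage; in a deployment manifest the relevant block looks like this (the numbers are illustrative, not the workshop's actual values):

```yaml
resources:
  requests:
    cpu: 500m        # reserved at scheduling time: four such pods fill 2 vCPUs
    memory: 128Mi
  limits:
    cpu: "1"         # hard ceiling the pod can burst to
    memory: 256Mi
```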
To allow more worker pods to be scheduled, we will enable the cluster autoscaler. The cluster autoscaler is an AKS feature that lets the k8s control plane create new nodes and add them to the cluster's node pool so that your application can scale automatically, without you having to worry about it. Run the commands below to enable the cluster autoscaler. You can also enable it at cluster creation time, or by updating the Bicep deployment scripts and rerunning them.
az aks update \
--resource-group $RGNAME \
--name $AKSCLUSTER \
--cluster-autoscaler-profile \
scale-down-unneeded-time=2m \
scale-down-utilization-threshold=0.8
az aks nodepool update \
--resource-group $RGNAME \
--cluster-name $AKSCLUSTER \
--name npuser01 \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 5
Rerun the test to see pods scheduled to help with the load, but this time let us increase the load. In the test run result screen, click on "View all test runs" in the top right corner. Then click on "Configure" -> "Test", and head over to the "Parameters" tab. Update the threads to 50 and loops to 250. Click "Apply" at the bottom left. This takes you to your tests screen. Click on the test you just modified, click "Run" at the top of the resulting page, then click "Run" to start the test. Click "Refresh" at the top left side of the screen, then click on the run you just executed.
You will find that the test completed successfully. Entering kubectl get pods -n superapp after a minute will show pods in the Pending state. Wait a couple of minutes and you will see a new node created. Check which pods are on the new node:
kubectl get pods -n superapp -o wide
After a couple of minutes, heading over to the "Nodes" tab of Azure monitor shows the new node and its utilization.
Let's test it again, this time with threads set to 300 and loops set to 450. You can also update the loadtesting/worker-hpa.yaml and server-hpa.yaml files to increase the maxReplicas numbers.
This exercise shows one of the advantages of containers and Kubernetes: you can optimize the utilization of your nodes (virtual machines) and scale the various components of your application independently, up and down, based on the demand on each particular service. With AKS CNI Overlay, you don't have to worry about IP exhaustion; the overlay network takes care of that for you. You can have a maximum of 250 pods per node with CNI Overlay, and those pod IP addresses come from a different IP space than your nodes (and virtual network).
Bridge to Kubernetes is an amazing tool that lets developers debug and test their code by running a microservice locally on their computer and having it connect to the other microservices running in their Kubernetes cluster. This way, they can test changes to their local microservice against the entire application already running on Kubernetes. For more information, check out this video. For Bridge to work, you need to be able to run the application locally. We begin by installing the packages required to run the Node Express API microservice locally.
:warning: To begin, you might have to set the context of your kubeconfig to use the superapp namespace in order for Bridge to Kubernetes to work properly. You will also want to delete the api HPA and redeploy the api deployment:
kubectl config set-context --current --namespace=superapp
kubectl delete -f api-hpa.yaml
cd ../k8s
kubectl delete -f api-deployment.yaml && kubectl apply -f api-deployment.yaml
CD to the directory that contains your API server code:
cd ../smartbrain/smartbrain/smartbrainapi/
Install the required packages
npm install
Open the command palette (shortcut: Ctrl + Shift + P)
Type "bridge" and select "Bridge to Kubernetes: Configure"
Select the "api-service" service
Enter "5000" as the port
Select "Configure Bridge to Kubernetes without a launch configuration"
Select "No" since you are the only one working on this application. You should see a notification similar to the one below stating that your configuration was successful.
Click on the "Kubernetes" status bar menu
Click on the "Connect to the cluster" button that pops up at the top of the screen and wait for the connection to be established
When you see the pop-up, click on "Continue" and then "Yes" to provide Bridge to kubernetes the required permission
Once the connection is established, take note of the host address of the postgres-service. You might need it if you run into service discovery issues.
You can open the smartbrain\smartbrain\smartbrainapi\server.js file, put a breakpoint on the last line of the script, and click on the Debug tab on the left.
Click on the green Play button next to "Launch Program" at the top left side of the screen to begin debugging
Click on the "Continue" button of the debugger to complete the run
If you head to the "DEBUG CONSOLE" tab and see an error about connecting to postgres, it means you are having service discovery issues.
BUGFIX: To fix the postgres connection error: in your smartbrain\smartbrain\smartbrainapi\server.js file, find the connection variable and replace host in the object with the host address of the postgres-service noted earlier.
Run the debugger, and in the browser try detecting the face in the picture. You will see that the detect count error persists.
Head to the smartbrain\smartbrain\smartbrainapi\server.js file and fix the error by changing the bug variable to false.
Head back to the index file and run the debugger again
Refresh the browser, sign in, and try detecting a face again. You will see that the detect count now works properly.
You can stop the debugger. Once you are satisfied with the changes, disconnect from Bridge to Kubernetes by clicking on the Kubernetes status bar menu and then "Disconnect current session". You can then head to your Terminal tab and hit any key to close the connection. You might have to delete the current api deployment and redeploy it to restore the connection to the api pod running on AKS.
kubectl delete deployment api-deployment
kubectl apply -f api-deployment.yaml
:warning: Before you proceed, set the host value in the connection object back to pgHost (assuming you replaced that variable with the postgres IP address to fix the postgres connection error above)
You can now push your changes to GitHub
git add .
git commit -m "fix ranking error and update manifest files"
git push origin test # or main assuming that is the branch you are using
Now that we have pushed our changes to GitHub, you can follow the instructions in this section of the demo repo to use the "Automated Deployment" feature on AKS.
You will be using the automated deployment to update the api deployment, since that is the service with the bug that was fixed. When prompted to specify your folder, ensure you are pointing to the "smartbrain\smartbrain\smartbrainapi" path.
Please note that this is a preview feature that currently has two bugs, both of which you can easily fix in order to deploy your changes. After following the instructions above, your deployed workflow will fail; follow the steps below to fix it.
The first bug occurs because DevHub hasn't been updated to use kubelogin. Add:
- name: Set up kubelogin for non-interactive login
  uses: azure/use-kubelogin@v1
  with:
    kubelogin-version: 'v0.0.24'
to the generated workflow in the .github/workflows folder, right before the aks-set-context step. You will also need to add the following to the aks-set-context step (under with):
    admin: 'false'
    use-kubelogin: 'true'
Your pipeline file should look similar to the picture below:
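If the screenshot is hard to read, the relevant portion of the workflow with both additions applied might look roughly like this (the aks-set-context inputs other than admin and use-kubelogin are whatever DevHub generated for you; the env references are illustrative):

```yaml
- name: Set up kubelogin for non-interactive login
  uses: azure/use-kubelogin@v1
  with:
    kubelogin-version: 'v0.0.24'

- uses: azure/aks-set-context@v3
  with:
    resource-group: ${{ env.RESOURCE_GROUP }}   # as generated by DevHub
    cluster-name: ${{ env.CLUSTER_NAME }}       # as generated by DevHub
    admin: 'false'
    use-kubelogin: 'true'
```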
The second bug only occurs in AKS clusters that use Azure RBAC for authorization (which is the case for the cluster we are using in this workshop). It occurs because the OIDC issuer, which provides an identity to the GitHub deployment runner, isn't granting sufficient permissions to that identity.
As you have seen, AKS has lots of features that help with developer productivity by automating steps for you. These can help you get to production very quickly. There are many tools that were not covered in this workshop, including the following:
What AKS features for developer productivity would you like me to showcase next? Let me know by creating issues or commenting on existing ones.