Azure / aks-baseline-automation

Repository for the AKS Landing Zone Accelerator program's Automation reference implementation
MIT License

This repository demonstrates recommended ways to automate the deployment of the components composing a typical AKS solution.

To manage the complexity of deploying a Kubernetes-based solution, it is best to look at it in terms of a separation of concerns: which team in an enterprise environment should be concerned with which aspect of the deployment, and what tools and processes should that team employ to best achieve its objectives? This implementation and the associated documentation are intended to inform the interdisciplinary teams involved in AKS deployment and lifecycle management automation. These teams may include:

Each team is responsible for maintaining its own automation pipeline. Access to Azure from these pipelines should only be granted through a Service Principal, a Managed Identity or, preferably, a Federated Identity with the minimum set of permissions required to automatically perform the tasks that the team is responsible for.
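As an illustration of the Federated Identity option, the following is a minimal sketch of a GitHub Actions workflow that signs in to Azure with OpenID Connect instead of a stored client secret. The workflow name and the secret names (AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_SUBSCRIPTION_ID) are assumptions for this example and are not taken from this repository. It also assumes a federated credential has already been configured on the identity for this GitHub repository.

```yaml
# Hypothetical workflow, for illustration only.
name: example-azure-login
on:
  workflow_dispatch:

permissions:
  id-token: write   # allows the job to request an OIDC token from GitHub
  contents: read    # allows the job to check out the repository

jobs:
  login:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Log in to Azure with a federated credential
        uses: azure/login@v2
        with:
          # Assumed secret names; no client secret is stored in GitHub.
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Confirm which identity and subscription are in use
        run: az account show --output table
```

The role assignments granted to that identity, rather than the workflow itself, are what keep the pipeline limited to the tasks the team owns.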

Infrastructure as Code

This section demonstrates the implementation of a CI/CD pipeline built using GitHub Actions to automate the deployment of AKS and the other Azure resources that AKS depends on. This pipeline deploys an AKS infrastructure similar to v1.24.0.0 of the AKS Baseline Reference Implementation using either Bicep or Terraform modules.

Infrastructure-as-Code

Deploy AKS using GitHub Actions and Bicep

Under the IaC/bicep folder you will find the instructions and the code to deploy the AKS Baseline Reference Implementation through a GitHub Actions pipeline leveraging Bicep CARML modules. The steps can be found here.
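To give a rough idea of what such a pipeline can look like, here is a minimal sketch of a workflow that deploys a Bicep template with the Azure CLI. It is not the workflow shipped in this repository. The file names main.bicep and main.parameters.json, the deployment name, the region, the secret names, and the choice of a subscription-scope deployment are all assumptions made for the example.

```yaml
# Hypothetical workflow, for illustration only.
name: example-bicep-deploy
on:
  workflow_dispatch:

permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}              # assumed secret name
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}              # assumed secret name
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}  # assumed secret name

      - name: Deploy the Bicep template at subscription scope
        run: |
          # Template and parameter file names are assumed, not taken from the repo.
          az deployment sub create \
            --name aks-baseline-example \
            --location eastus \
            --template-file IaC/bicep/main.bicep \
            --parameters IaC/bicep/main.parameters.json
```

If the template targets a resource group instead, `az deployment group create --resource-group <name>` would be the corresponding command.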

Deploy AKS using GitHub Actions and Terraform (in development)

Under the IaC/terraform folder you will find the instructions and the code to deploy the AKS Baseline Reference Implementation through a GitHub Actions pipeline leveraging CAF Terraform modules. The steps can be found here. This option is still in development.

Shared-Services

This section demonstrates the provisioning of the Shared-Services. These services are the common in-cluster components that are used by all applications running on the cluster. We also provide examples of metrics of interest from these Shared-Services that can be captured and surfaced in a dashboard to help with their maintenance.

In this section we demonstrate two implementation options:

The GitOps solution features:

Shared-Services Deployment

Note: in a real-world deployment you may want to have a dedicated GitHub repo and a dedicated ACR instance for Shared-Services to store artifacts (e.g. manifest files, Helm charts and Docker images), separating them from the ones used for IaC and the application workloads. For simplicity and convenience, we have placed all those artifacts within this same repo but in different folders.

Application Deployment

This section demonstrates the deployment of an application composed of multiple services by leveraging two options:

The Flask App application is used for this deployment because it is quite simple, yet it demonstrates how to deploy an application composed of multiple containers. In this case the application is composed of a web-front-end written in Python.

Blue/Green and Canary release strategies for this application will also be demonstrated. Note, however, that this feature has not been implemented yet; see issue https://github.com/Azure/aks-baseline-automation/issues/27.

Deploy sample applications using GitHub Actions (push method)

Multiple GitHub Actions workflows are used to demonstrate the deployment of sample applications through a CI/CD pipeline (push method). Please click on the links below for instructions on how to use these workflows.

Sample App | Scenario | Description | Tags
Flask Hello World | Docker Build | Builds a container image from code on the runner, then pushes it to ACR. Deployment is done via a push model. Requires the use of self-hosted runners if you deployed a private ACR per the instructions in the IaC section of this repo. To set up self-hosted runners, refer to the Self-hosted GitHub Runners section. |
Azure Vote | AKS Run Command | Sample of a reusable workflow called from the workflow App-Test-All.yml. Deploys the app using a Helm chart through the AKS Command Invoke. The focus here is to demonstrate how workloads in private clusters can still be managed through cloud-hosted GitHub runners (no need to install self-hosted runners as in the other samples). It also shows how to test your application using Playwright. |
Azure Vote | ACR Build | Another sample of a reusable workflow called from the workflow App-Test-All.yml. Builds a container image from code directly in Azure Container Registry (ACR). Deployment is done using the Azure Kubernetes GitHub actions. Requires the use of self-hosted runners if you deployed a private ACR per the instructions in the IaC section of this repo. To set up self-hosted runners, refer to the Self-hosted GitHub Runners section. |
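The AKS Run Command scenario relies on `az aks command invoke`, which runs a command inside the cluster through the Azure control plane so the runner does not need network line of sight to a private API server. The following is a minimal sketch of that idea, not the workflow from this repository. The resource group, cluster name, chart folder, release name, and secret names are assumptions for the example.

```yaml
# Hypothetical workflow, for illustration only.
name: example-aks-run-command
on:
  workflow_dispatch:

permissions:
  id-token: write
  contents: read

env:
  RESOURCE_GROUP: rg-example     # assumed resource group name
  CLUSTER_NAME: aks-example      # assumed cluster name

jobs:
  deploy:
    runs-on: ubuntu-latest       # cloud-hosted runner
    steps:
      - uses: actions/checkout@v4

      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}              # assumed secret name
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}              # assumed secret name
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}  # assumed secret name

      - name: Install the Helm chart through AKS Command Invoke
        working-directory: charts/azure-vote   # assumed chart folder
        run: |
          # "--file ." uploads the current folder so the chart is
          # available to the command that runs inside the cluster.
          az aks command invoke \
            --resource-group "$RESOURCE_GROUP" \
            --name "$CLUSTER_NAME" \
            --command "helm upgrade --install azure-vote ." \
            --file .
```

The identity used for login needs permission to run commands on the cluster. Without that permission the invoke call fails, whatever the runner type.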

Deploy sample applications using GitOps (pull method)

You can use GitOps with Flux or ArgoCD (pull method) as an alternative to GitHub Actions workflows to deploy your applications.

Refer to these instructions for how to set up your environment to deploy a sample application with GitOps using ArgoCD.
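With the pull method, the cluster reconciles itself against a Git repository instead of receiving deployments from a pipeline. A minimal sketch of an ArgoCD Application that expresses this is shown below. The application name, target namespace, branch, and repository path are assumptions for the example and do not necessarily match this repository's layout. ArgoCD is assumed to be installed in the argocd namespace.

```yaml
# Hypothetical ArgoCD Application, for illustration only.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sample-app                 # assumed application name
  namespace: argocd                # namespace where ArgoCD is assumed to run
spec:
  project: default
  source:
    repoURL: https://github.com/Azure/aks-baseline-automation
    targetRevision: main           # assumed branch
    path: workloads/sample-app     # assumed path to the manifests
  destination:
    server: https://kubernetes.default.svc   # the cluster ArgoCD runs in
    namespace: sample-app          # assumed target namespace
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual changes made in the cluster
    syncOptions:
      - CreateNamespace=true
```

Once this manifest is applied, a change merged to the tracked path is picked up by ArgoCD without any pipeline needing credentials to the cluster.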

Lifecycle-Management

Different components of an AKS solution are often owned by different teams and typically follow their own lifecycle management schedule and process, sometimes using different tools. In this section we will cover the following lifecycle management processes:

For better security and version control, all these lifecycle management processes need to be git driven so that any change to any component of the AKS solution is done through code from a Git Repository and goes through a review and approval process. For this reason, we will provide two options to automatically carry out these tasks:

Note that these features have not been implemented yet in this reference implementation. For the automation of the cluster lifecycle-management see issue https://github.com/Azure/aks-baseline-automation/issues/23.

Secure DevOps

A typical DevOps process for deploying containers to AKS can be depicted by the diagram below: Typical DevOps

The security team's focus is to make sure that security is built into this automation pipeline and that security tasks are shifted to the left and automated as much as possible. For example, they will need to work with the different automation teams to make sure that the following controls are in place within their pipelines:

Secure DevOps

In addition to this oversight role, they will also have to build and maintain their own pipeline to automate the management of security-related resources outside the clusters (Azure policies, firewall rules, NSGs, Azure RBAC, etc.) as well as inside the cluster (Network Security Policies, Service Mesh Authentication and Authorization rules, Kubernetes RBAC, etc.).

Incorporating security controls into the DevOps pipeline is not yet implemented in this reference implementation; see issue https://github.com/Azure/aks-baseline-automation/issues/25.

GitHub Repo structure

This repository is organized as follows:

AKS Baseline Automation Repo Structure

Self-hosted GitHub Runners

The default deployment methods in this Reference Implementation use GitHub runners hosted in the GitHub Cloud.

For better security, you may want to set up GitHub self-hosted runners within your Azure subscription. For example, if you are using private AKS clusters, you will need self-hosted runners hosted in an Azure VNet with connectivity to your clusters to be able to run GitHub Actions workflows that manage those clusters and the workloads that run on them.

For more information about the benefits of self-hosted runners, refer to this article. For instructions on how to set up your own self-hosted runners, refer to this article.

The diagram below depicts how a GitHub runner hosted in your Azure subscription uses a Managed Identity to connect securely to your Azure subscription and make changes to your Azure and Kubernetes resources:

GitHub Runners
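A minimal sketch of a workflow that targets such a runner is shown below. It assumes the runner is registered with the default self-hosted and linux labels, has the Azure CLI and kubectl installed, and runs on a machine with a Managed Identity that has been granted access to the cluster. The resource group and cluster names are placeholders.

```yaml
# Hypothetical workflow, for illustration only.
name: example-self-hosted-runner
on:
  workflow_dispatch:

env:
  RESOURCE_GROUP: rg-example     # assumed resource group name
  CLUSTER_NAME: aks-example      # assumed cluster name

jobs:
  manage-cluster:
    # Route the job to a runner inside the VNet rather than a GitHub-hosted one.
    runs-on: [self-hosted, linux]
    steps:
      - name: Log in with the runner's Managed Identity
        run: az login --identity

      - name: Fetch credentials for the private cluster
        run: |
          az aks get-credentials \
            --resource-group "$RESOURCE_GROUP" \
            --name "$CLUSTER_NAME" \
            --overwrite-existing

      - name: Verify connectivity to the cluster
        run: kubectl get nodes
```

No credentials are stored in GitHub in this setup. Clusters that use Microsoft Entra ID integration may additionally require kubelogin on the runner for non-interactive authentication.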

Contributing

This project welcomes contributions and suggestions. Please refer to the roadmap for this reference implementation under this repo's Project for the features that are planned. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.