Tech-Modernization / helmfile-infra

helmfile project to deploy prometheus-operator, sonarqube, and other infrastructure tools


Deploys kubernetes infrastructure components to multiple kubernetes environments using helm and helmfile.

Helmfile declaratively manages the configuration and deployment of these charts across multiple environments. Helmfile is like helm for helm. It allows a gitflow approach to helm chart releases, where release configuration is driven by files in git, such as changes to yaml values files.
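As a rough illustration of that declarative layout, the sketch below writes a minimal helmfile to helmfile.example.yaml. The environment names and the values path pattern come from this README; the release name, namespace, chart reference, and repository URL are assumptions and are not copied from this project's real helmfile.yaml.

```shell
# Write a minimal, hypothetical helmfile for illustration only.
# The quoted 'EOF' stops the shell from expanding the {{ ... }} template.
cat > helmfile.example.yaml <<'EOF'
# Environments selectable with `helmfile --environment <name>`.
environments:
  ldev: {}
  lprod: {}
  gcp: {}

repositories:
  - name: stable                          # assumed repository name
    url: https://charts.helm.sh/stable    # assumed chart repository URL

releases:
  - name: myapp-prometheus-operator       # assumed release name
    namespace: myapp-prometheus           # assumed namespace
    chart: stable/prometheus-operator     # assumed chart reference
    values:
      # One values file per environment, picked when helmfile renders.
      - config/myappw-prometheus-operator/{{ .Environment.Name }}.yaml.gotmpl
EOF
```

Changing a values file in git and re-running helmfile is then the whole release workflow. Newer helmfile versions may expect a different file layout for templated helmfiles, so treat this as a sketch for the version listed under Prerequisites.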


Charts commented out


Additional kubernetes clusters, including on-prem Anthos, Azure, AWS, and others, can be accommodated with minimal effort.

Instructions are provided on setting up a k8s cluster using

Prerequisites (mac)

# brew install kubernetes-helm
# helm version
version.BuildInfo{Version:"v3.0.1", GitCommit:"7c22ef9ce89e0ebeb7125ba2ebf7d421f3e82ffa", GitTreeState:"clean", GoVersion:"go1.13.4"}
# brew install helmfile
# helmfile --version
helmfile version v0.100.2
# helm plugin install 
# brew install jsonnet
# pip install pyyaml (python2)
# brew install stern
# brew install kubectx
# brew install sops
# helm plugin install 
# brew install gnu-getopt
# brew install cfssl

kube credentials

Kubernetes credentials for the appropriate environment must be used with helmfile/helm/kubectl.

For vk8, a KUBECONFIG file for the myapp-prometheus service account has been generated by the k8s admin.
This can be used by setting the environment var:

# export KUBECONFIG=/path/to/kubeconfig_serviceaccount

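Note that kubens, kubectx, and similar k8s tools change the active context or namespace in the kubeconfig, which affects which cluster and namespace helmfile targets.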

Alternatively, the helmfile --kube-context flag can be used to select a kubeconfig context.

For gcp, you can set up GKE credentials using a command like:

gcloud container clusters get-credentials acme --zone us-central1-c --project bhood-214523


All required helm charts and configurations are defined in helmfile.yaml. You must specify the environment using the -e or --environment flag.

Lint helmfile/charts:

# helmfile --environment ldev lint

Diff between your kubernetes cluster and the helmfile:

# helmfile --environment ldev diff

To apply the changes with a decision gate to validate:

# helmfile --environment lprod --interactive apply

To apply the changes without doing a diff or decision gate:

# helmfile --environment gcp sync
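Because every command above needs an environment, a small wrapper can catch a mistyped name before helmfile runs. The function below is a hypothetical helper, not part of this project; it assumes the only valid environments are the ones this README names (ldev, lprod, gcp). The HELMFILE variable is also an assumption, added so the command line can be previewed.

```shell
# Hypothetical helper: refuse unknown environment names before calling helmfile.
# HELMFILE can be overridden (e.g. HELMFILE=echo) to preview the command line.
hf() {
  local env_name="${1:-}"
  if [ "$#" -gt 0 ]; then
    shift
  fi
  case "$env_name" in
    ldev|lprod|gcp) ;;   # environment names mentioned in this README
    *)
      echo "unknown environment: '$env_name' (expected ldev, lprod, or gcp)" >&2
      return 2
      ;;
  esac
  # Remaining arguments (global flags and the subcommand) pass through unchanged.
  "${HELMFILE:-helmfile}" --environment "$env_name" "$@"
}
```

With this in place, `hf ldev lint`, `hf ldev diff`, and `hf lprod --interactive apply` mirror the commands above, while `hf prod diff` fails immediately instead of reaching helmfile.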

Secret setup

Prometheus Operator and PushGateway

Prometheus Operator

The Prometheus Operator simplifies monitoring of k8s services and deployments. It manages Prometheus, Alertmanager, and Grafana configuration, keeps that configuration customizable, and makes it Kubernetes native.

When you deploy a new version of your app, k8s creates a new pod (container) and, once the new pod is ready, destroys the old one. Prometheus constantly watches the k8s API and, when it detects a change, creates a new Prometheus configuration based on the changed services (pods).


Prometheus-operator uses a Custom Resource Definition (CRD) named ServiceMonitor to abstract scrape-target configuration. For example, to monitor an NGINX pod, a ServiceMonitor selects the Service in front of the pod using a matchLabels selector. The prometheus-operator finds the matching endpoints from that label selector and creates a Prometheus target, so Prometheus scrapes the metrics endpoint.
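A ServiceMonitor for the NGINX case might look like the sketch below, which writes a manifest to nginx-servicemonitor.yaml. The resource name, namespace, labels, and port name are all assumptions. It also assumes NGINX exposes Prometheus metrics through an exporter, since stock NGINX does not serve them in Prometheus format.

```shell
# Write a hypothetical ServiceMonitor manifest for an NGINX Service.
cat > nginx-servicemonitor.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: nginx                              # assumed name
  namespace: nginx                         # assumed namespace
  labels:
    release: myapp-prometheus-operator     # assumed; must match what Prometheus selects
spec:
  selector:
    matchLabels:
      app: nginx                           # assumed label on the NGINX Service
  endpoints:
    - port: metrics                        # assumed name of the Service port with metrics
      path: /metrics
      interval: 30s
EOF
```

Applying it with `kubectl apply -f nginx-servicemonitor.yaml` should make the target appear in Prometheus, provided the Prometheus instance is configured to select ServiceMonitors carrying these labels in this namespace.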

Environment differences

In vk8, the full Prometheus Operator stack (including the CRD/operator, AlertManager, Prometheus, and Grafana) is deployed by the vk8 platform. These are used to self-monitor the k8s cluster and to monitor platform dependencies such as Jenkins, ELK, and Kafka. Deployment in this cluster deploys the Prometheus Operator helm chart with the CRD/operator and AlertManager disabled, as these are shared by the platform Prometheus. Only Prometheus and Grafana are installed, in a myapp-prometheus namespace using a myapp-prometheus serviceaccount.

Ingress Endpoints

Endpoints follow a naming convention where the "gcp" portion above is the environment, which is one of lprod, ldev, gcp

Sidecar grafana dashboards

Grafana dashboards (and other sidecar dashboards) can be added using a config map with the label 'grafana_dashboard'. Dashboards in a config map with this label are automatically discovered and added to grafana.
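A labeled config map of that kind could be sketched as follows. The config map name, the label value "1", and the tiny dashboard JSON are assumptions for illustration; a real dashboard would normally be exported from grafana.

```shell
# Write a hypothetical dashboard ConfigMap that the grafana sidecar can discover.
cat > example-dashboard-configmap.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-dashboard            # assumed name
  namespace: myapp-prometheus
  labels:
    grafana_dashboard: "1"           # label key from this README; value is assumed
data:
  # Each key becomes one dashboard file loaded by grafana.
  example-dashboard.json: |
    {
      "title": "Example dashboard",
      "uid": "example-dashboard",
      "panels": []
    }
EOF
```

After `kubectl apply -f example-dashboard-configmap.yaml`, the sidecar is expected to pick the dashboard up without a grafana restart, assuming the sidecar is enabled and watching this namespace.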

To add or modify a dashboard:

Datasource configuration

Grafana datasources often include secrets which should not be checked into source control (git or bitbucket).

Environment-specific datasources are in config/myappw-prometheus-operator/{{Environment.Name}}.yaml.gotmpl

These use environment-specific environment secrets in environments/{{Environment.Name}}/secrets.yaml
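The relationship between the two files can be sketched as below. Everything here is hypothetical: the key names under exampleDb, the datasource itself, and the example file names. It assumes helmfile merges decrypted environment secrets into .Environment.Values and that the chart accepts grafana.additionalDataSources; check both against the versions in use.

```shell
# Hypothetical plaintext shape of environments/<env>/secrets.yaml before
# encryption with sops (key names are made up for illustration).
cat > secrets.example.yaml <<'EOF'
exampleDb:
  user: grafana_reader
  password: change-me
EOF

# Hypothetical values template that reads those keys when helmfile renders.
# The quoted 'EOF' stops the shell from expanding the {{ ... }} expressions.
cat > datasource.example.yaml.gotmpl <<'EOF'
grafana:
  additionalDataSources:
    - name: example-db
      type: postgres
      url: example-db.example.internal:5432
      user: {{ .Environment.Values.exampleDb.user | quote }}
      secureJsonData:
        password: {{ .Environment.Values.exampleDb.password | quote }}
EOF
```

In this arrangement only the sops-encrypted secrets file is committed, and the helmfile environment lists it under a `secrets:` key so it is decrypted at deploy time.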

Grafana custom plugins

Because of network restrictions, the k8s cluster might not be able to reach external sites to download and install plugins. The preferred approach, for security and docker image startup speed, is to preinstall plugins on the docker image and store the docker images in the local repo per
git clone
cd grafana
# git checkout 6.6.1
cd packaging/docker/custom
docker build \
--build-arg "GRAFANA_VERSION=latest" \
--build-arg "GF_INSTALL_PLUGINS=grafana-piechart-panel,grafana-clock-panel,grafana-simple-json-datasource" \
-t dockerhub.artifactory.{{domain}}/grafana/grafana:6.6.2-1  -f Dockerfile .
docker login dockerhub.artifactory.{{domain}}
docker push dockerhub.artifactory.{{domain}}/grafana/grafana:6.6.2-1

docker run -d -p 3000:3000 --name=grafana dockerhub.artifactory.{{domain}}/grafana/grafana:6.6.2-1

extra (secret) files

Some extra files are needed for configuration that include secrets which should not be checked into source control (git or bitbucket).

To add or modify extra:

Shared AlertManager

In vk8 environments, the alertmanager and prometheus operator are installed and operated by the cluster administrators. The shared alertmanager is used by myapp prometheus and potentially other cluster tenants (myapp). AlertManager uses calert and is configured for Google Chat, and configuration must be coordinated with platform operators.

Alertmanager includes receiver configuration for gchat-notify (using calert) and route configuration for targeting specific receivers based on alert labels. The severity and profile (environment) labels are used to target specific rooms. The alert annotations description, message, runbook_url, and link are used to customize notification messages.
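An alert that carries those labels and annotations might be declared as in the sketch below, which writes a PrometheusRule manifest to example-alert-rule.yaml. The alert name, expression, label values, and URLs are placeholders, and the release label is an assumption about how Prometheus selects rules.

```shell
# Write a hypothetical PrometheusRule using the routing labels and annotations
# described above. The quoted 'EOF' keeps {{ $labels.instance }} literal.
cat > example-alert-rule.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-alerts                       # assumed name
  namespace: myapp-prometheus
  labels:
    release: myapp-prometheus-operator       # assumed rule selector label
spec:
  groups:
    - name: example.rules
      rules:
        - alert: ExampleTargetDown           # placeholder alert
          expr: up{job="example"} == 0
          for: 5m
          labels:
            severity: critical               # used for routing; value assumed
            profile: ldev                    # environment; used for routing
          annotations:
            description: "Target {{ $labels.instance }} has been down for 5 minutes."
            message: "Example target is down."
            runbook_url: "https://runbooks.example.invalid/ExampleTargetDown"
            link: "https://grafana.example.invalid/d/example"
EOF
```

Which severity and profile values map to which chat room depends on the shared alertmanager routes, so the values used here would need to be agreed with the platform operators.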

The myapp-prometheus-operator standard alerts include PrometheusNotConnectedToAlertmanagers, which should be manually disabled for now (or configured for Blackhole). A future helm chart version could eliminate this alert when the alertmanager is not included in the release, or modify it to match the tags of the configured shared alertmanager.

SLO dashboard as code jsonnet

SLO dashboards are generated with jsonnet, following an IaC approach.
helmfile hooks trigger generation of prometheus rules, prometheus alerts, and grafana dashboards for the kubeapi and myappapi specs, and these are deployed to prometheus. They apply a standardized RED method (Request Rate, Errors, Duration) using a data-driven IaC approach based on

See for more info
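A helmfile hook that runs jsonnet before a release is processed could be sketched like this. The release name, chart reference, and the slo/main.jsonnet and generated paths are assumptions, not the project's actual layout; the sketch only shows where such a hook sits in a release definition.

```shell
# Write a hypothetical release fragment with a jsonnet generation hook.
cat > slo-hook.example.yaml <<'EOF'
releases:
  - name: myapp-prometheus-operator        # assumed release name
    namespace: myapp-prometheus
    chart: stable/prometheus-operator      # assumed chart reference
    hooks:
      # "prepare" fires after the release is loaded and before helmfile acts on it.
      - events: ["prepare"]
        showlogs: true
        command: "jsonnet"
        # -m writes one output file per top-level key into the given directory,
        # which is assumed to exist already.
        args: ["-m", "generated", "slo/main.jsonnet"]
EOF
```

Generating rules and dashboards in a hook keeps the rendered files out of git, so the jsonnet specs stay the single source of truth.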


SonarQube is an open-source continuous code inspection tool that empowers developers to write cleaner and safer code.


Vault and etcd-operator

Hashicorp Vault, backed by etcd

kubectl exec -ti -n vault vault-0 -- vault operator init

Keep the 5 unseal keys and the initial root token somewhere safe.

Each of vault-0, vault-1, and vault-2 must be given at least 3 unseal keys.

kubectl exec -ti -n vault vault-0 -- vault operator unseal  XXXX
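Since the same unseal command has to be repeated for three keys on each of three pods, a loop can reduce typing. The function below is a hypothetical helper built on the command above; the KUBECTL override is an assumption, added only so the commands can be previewed without a cluster.

```shell
# Hypothetical helper: submit the given unseal keys to each Vault pod in turn.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the commands.
unseal_all() {
  local pod key
  if [ "$#" -lt 3 ]; then
    echo "usage: unseal_all KEY1 KEY2 KEY3 [...]" >&2
    return 2
  fi
  for pod in vault-0 vault-1 vault-2; do
    for key in "$@"; do
      "${KUBECTL:-kubectl}" exec -ti -n vault "$pod" -- vault operator unseal "$key"
    done
  done
}
```

Running `KUBECTL=echo unseal_all k1 k2 k3` prints the nine commands without executing them. Keys passed as arguments can end up in shell history, so entering them interactively may be preferable for a real cluster.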

Ingress Endpoints

Confluent helm

Strimzi Kafka Operator


ServiceAccount and Namespace setup

These charts use serviceaccounts and namespaces created beforehand by the cluster administrator.

For gcp, the cluster, service accounts, and namespaces must be set up outside of this helmfile project:

kubectl create ns nginx
kubectl create ns prometheus
kubectl create ns sonarqube
kubectl create ns myapp-prometheus
kubectl create ns devops
kubectl create ns cp
kubectl create ns vault
kubectl create ns kafka
kubectl create ns my-kafka-project

Static IP addresses and DNS records to support Ingress must also be created.

It may be necessary to set up firewall access from the GKE control plane to your cluster nodes.

VPC_NETWORK=$(gcloud container clusters describe $CLUSTER_NAME --region $CLUSTER_REGION --format='value(network)')
MASTER_IPV4_CIDR_BLOCK=$(gcloud container clusters describe $CLUSTER_NAME --region $CLUSTER_REGION --format='value(privateClusterConfig.masterIpv4CidrBlock)')
NODE_POOLS_TARGET_TAGS=$(gcloud container clusters describe $CLUSTER_NAME --region $CLUSTER_REGION --format='value[terminator=","](nodePools.config.tags)' --flatten='nodePools[].config.tags[]' | sed 's/,\{2,\}//g')


gcloud compute firewall-rules create "allow-apiserver-to-admission-webhook-8443" \
      --allow tcp:8443 \
      --network="$VPC_NETWORK" \
      --source-ranges="$MASTER_IPV4_CIDR_BLOCK" \
      --target-tags="$NODE_POOLS_TARGET_TAGS" \
      --description="Allow apiserver access to admission webhook pod on port 8443" \
      --direction INGRESS


gcloud container clusters describe acme --zone us-central1-c | yq r - ipAllocationPolicy.clusterIpv4CidrBlock
gcloud compute firewall-rules list \
    --filter 'name~^gke-acme' \
    --format 'table(
gcloud compute firewall-rules create allow-apiserver-to-admission-webhook-8443 \
    --action ALLOW \
    --direction INGRESS \
    --source-ranges \
    --rules tcp:8443 \
    --target-tags gke-acme-f12e5ab7-node
