tazama-lf / AKS-helm

AKS Helm charts to install all the applications needed for the FRMSCOE platform
https://frmscoe.github.io/AKS-helm/
Apache License 2.0

AKS Detailed Installation Guide

This guide covers the end-to-end installation of Tazama on a cluster. It assumes your AKS infrastructure is already set up.

Instructions

Read through the infrastructure spec before starting with the deployment guide.

Infrastructure Spec for Tazama Sandbox

Infrastructure Spec for Tazama

Important: Access to the Tazama GIT Repository is required to proceed. If you do not currently have this access, or if you are unsure about your access level, please reach out to the Tazama Team to request the necessary permissions. It's crucial to ensure that you have the appropriate credentials to access the repository for seamless integration and workflow management.

Azure Container Registry (ACR) Setup

  1. Create an Azure container registry with a name, e.g. tazama.
  2. Add a list of Repositories for the different services listed below.

Our repository list includes a variety of components, each representing specific microservices and tools within our ecosystem. You need to create these in your Azure environment in the ACR service.

Repository list:

The default release version is rel-1-0-0, which gives repository names such as rule-001-rel-1-0-0-dev.

  3. Log in to the registry

Before pushing and pulling container images, you must log in to the registry instance. Sign in to the Azure CLI on your local machine, then run the az acr login command. Specify only the registry resource name when logging in with the Azure CLI; don't use the fully qualified login server name. For example: az acr login --name tazama

  4. Create a token that will be used in the later deployment steps. This token needs to be saved in the Jenkins credentials. The command for this is az acr token; please refer to the az acr token doc for further explanation.
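The login and token steps above can be sketched as follows. This is a minimal sketch, not a definitive procedure: the registry name tazama comes from the example above, while the token name jenkins-token is a hypothetical placeholder and the choice of the built-in _repositories_push scope map is an assumption; check the az acr token doc for the scope your setup needs.

```shell
# Assumptions: registry name "tazama" (from the example above);
# "jenkins-token" is a hypothetical token name.
REGISTRY_NAME="tazama"
TOKEN_NAME="jenkins-token"

# Log in using only the registry resource name, not the login server name.
az acr login --name "$REGISTRY_NAME"

# Create a token scoped by one of the scope maps ACR defines by default.
# The command prints generated passwords; store one in Jenkins credentials.
az acr token create \
  --name "$TOKEN_NAME" \
  --registry "$REGISTRY_NAME" \
  --scope-map _repositories_push
```

The token name then acts as the username and the generated password as the password when the registry credentials are added in Jenkins.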

Step 1 - Helm charts

Overview

This guide will walk you through the setup of Tazama (a real-time antifraud and money laundering monitoring system) on a Kubernetes cluster using Helm charts. Helm charts simplify the deployment and management of applications on Kubernetes clusters. We will deploy various services, ingresses, pods, replica sets, and more.

Prerequisites

Adding Required Namespaces

The installation of our system requires the creation of specific namespaces within your cluster. These namespaces will be automatically created by the infra-chart Helm chart. Ensure these namespaces exist before proceeding:

If they are not created automatically, you can manually add them using the following command for each namespace:

kubectl create namespace <namespace-name>

The different Helm charts are listed below:

  1. NATS
  2. ElasticSearch
  3. ArangoDB (single deployment)
  4. ArangoDB Ingress Proxy
  5. Jenkins
  6. Redis-Cluster
  7. Nginx ingress
  8. APM (Elasticsearch)
  9. Logstash (Elasticsearch)
  10. Kibana (Elasticsearch)
  11. Infra-chart
  12. Grafana - Optional
  13. Prometheus - Optional
  14. Vault - Optional
  15. KeyCloak - Optional

Optional - Please note that these are additional features; while not required, they can enhance the platform's capabilities. Implementing them is optional and will not hinder the basic operation or the end-to-end functionality of the platform.

Note: Another Helm chart exists for the clustered version of ArangoDB, as mentioned on the linked page. However, the single deployment version is preferred over the clustered one because it includes functionality that is either absent from the clustered version or only available in the enterprise option.

Repo

https://github.com/tazama-lf/AKS-helm

Helm Repository Setup

First, add the Tazama Helm repository to enable the installation of charts:

helm repo add Tazama https://tazama-lf.github.io/AKS-helm/
helm repo update

To confirm the Tazama repo has been successfully added:

helm search repo Tazama

Ingress Setup

To expose services outside your cluster, enable ingress on necessary charts:

  1. Kibana
  2. ArangoDB
  3. Jenkins
  4. TMS
helm install kibana Tazama/kibana --namespace=development --set ingress.enabled=true
...

If you prefer not to configure an ingress controller, you can simply use port forwarding to access the front-end interfaces of your applications. This approach will not impact the end-to-end functionality of your system, as it is designed to utilize fully qualified domain names (FQDNs) for internal cluster communication.

Installing Helm Charts

The Tazama system is composed of multiple Helm charts for various services and components. These need to be installed in a specific order due to dependencies.

  1. Infra-chart - Sets up necessary namespaces and storage classes.
helm install infra-chart Tazama/infra-chart
helm repo update
  2. Follow with the installation of other charts as listed, specifying the namespace as required:
helm install nginx-ingress-controller Tazama/ingress-nginx --namespace=ingress-nginx
helm install elasticsearch Tazama/elasticsearch --namespace=development
helm install kibana Tazama/kibana --namespace=development
helm install apm Tazama/apm-server --namespace=development
helm install logstash Tazama/logstash --namespace=development
helm install arangodb-ingress-proxy Tazama/arangodb-ingress-proxy --namespace=development
helm install arango Tazama/arangodb --namespace=development
helm install redis-cluster Tazama/redis-cluster --namespace=development
helm install nats Tazama/nats --namespace=development
  3. Install Jenkins with Helm by following the official docs. Take note of the post-installation notes to retrieve the password and set up port forwarding.
helm repo add jenkins https://charts.jenkins.io
helm repo update
helm install jenkins jenkins/jenkins --set ingress.enabled=true --namespace=cicd
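Before moving on, it can help to confirm that the releases were installed and that their pods are starting. The commands below are a minimal sketch that assumes the namespaces used in the install commands above (development, cicd and ingress-nginx).

```shell
# List every Helm release across all namespaces with its status.
helm list --all-namespaces

# Check that the pods in the namespaces used above are starting up.
kubectl get pods --namespace development
kubectl get pods --namespace cicd
kubectl get pods --namespace ingress-nginx
```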

Accessing Jenkins UI

The following sections of the guide require you to work within the Jenkins UI. You can access the UI either through a domain, if you configured an ingress, or by port forwarding.

Port forward Jenkins to be accessible on localhost:8080 by running: kubectl --namespace cicd port-forward svc/jenkins 8080:8080

Get your 'admin' user password by running: kubectl exec --namespace cicd -it svc/jenkins -c jenkins -- /bin/cat /run/secrets/additional/chart-admin-password && echo

Navigate to the Jenkins UI and log in with the username admin and the retrieved password. Go to Manage Jenkins and, under System Configuration, select Plugins. Install the Configuration File, Nodejs and Docker plugins, which enable later configuration steps.

For optional components like Grafana, Prometheus, Vault, and KeyCloak, use similar commands if you decide to implement these features.

Extra Information: https://helm.sh/docs/helm/helm_install/

Uninstalling Charts

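If you need to remove the Tazama deployment, uninstall each release by the name and namespace it was installed under:

helm uninstall <release-name> --namespace=<namespace-name>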

Step 2 - Configuration

For a system utilizing a variety of Helm charts, optimizing performance, storage, and configuration can significantly impact its efficiency and scalability. Below are details on how to configure and optimize each component for your Tazama system:

1. NATS

2. ElasticSearch

3. ArangoDB (Single Deployment)

4. ArangoDB Ingress Proxy

5. Jenkins

6. Redis-Cluster

7. Nginx Ingress Controller

8. APM, Logstash, Kibana (Elasticsearch)

9. Grafana

10. Prometheus

11. Vault

12. KeyCloak

Each of these components plays a critical role in the Tazama system. By carefully configuring and optimizing them according to the guidelines provided, you can ensure that your system is secure, scalable, and performs optimally. Always refer to the official documentation for the most up-to-date information and advanced configuration options.

Step 3: Post-Installation Configuration

Elasticsearch Kube Secret for LumberJack

The processor pods write logs to the lumberjack deployment, which then writes the log information to Elasticsearch. For this to work, the Elasticsearch certificate secret must also exist in the processor namespace.

Logging Data View

  1. A secret is created for Elasticsearch after the Helm install. Duplicate this secret.
  2. In the duplicate, change the namespace from development to processor.

Example

apiVersion: v1
kind: Secret
metadata:
  name: elasticsearch-master-certs
  namespace: processor
type: kubernetes.io/tls
data:
  ca.crt: >-
  tls.crt: >-
  tls.key: >-
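The duplication described above can also be done from the command line. This is a minimal sketch that assumes the secret is named elasticsearch-master-certs and lives in the development namespace, as in the example, and that jq is installed. It strips the cluster-generated metadata so the copy can be created as a new object.

```shell
# Read the secret from development, drop server-generated metadata,
# retarget it to the processor namespace, and apply the copy.
kubectl get secret elasticsearch-master-certs --namespace development --output json \
  | jq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.creationTimestamp,
            .metadata.ownerReferences, .metadata.managedFields)
        | .metadata.namespace = "processor"' \
  | kubectl apply --filename -
```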

Setting up TLS for Ingress

Secure your ingress with TLS by creating a tlscomsecret in each required namespace:

You can generate a self-signed certificate and private key with this command:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=${HOST}/O=${HOST}" -addext "subjectAltName = DNS:${HOST}"

  1. Create a secret with your TLS certificate and key:
apiVersion: v1
kind: Secret
metadata:
  name: tlscomsecret
  namespace: development
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded-cert>
  tls.key: <base64-encoded-key>

Or

You can use kubectl to create the secret by running the command below:

kubectl create secret tls tlscomsecret --key tls.key --cert tls.crt -n development

  2. Apply this configuration for each relevant namespace (development, processor, cicd, default).
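The two steps above can be sketched as one script that generates a throwaway self-signed certificate, base64-encodes it, and writes one tlscomsecret manifest per namespace. The host example.test.com and the temporary working directory are assumptions for illustration; for a real deployment, replace the generated certificate with your own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: HOST is the domain the certificate is issued for.
HOST="${HOST:-example.test.com}"
workdir="$(mktemp -d)"
mkdir -p "$workdir/manifests"

# Generate a self-signed certificate and key (same options as the command above).
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout "$workdir/tls.key" -out "$workdir/tls.crt" \
  -subj "/CN=${HOST}/O=${HOST}" -addext "subjectAltName = DNS:${HOST}"

# Encode on a single line; tr removes line wraps that some base64 builds add.
crt_b64="$(base64 < "$workdir/tls.crt" | tr -d '\n')"
key_b64="$(base64 < "$workdir/tls.key" | tr -d '\n')"

# Write one Secret manifest per namespace listed in the step above.
for ns in development processor cicd default; do
  cat > "$workdir/manifests/tlscomsecret-$ns.yaml" <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: tlscomsecret
  namespace: $ns
type: kubernetes.io/tls
data:
  tls.crt: $crt_b64
  tls.key: $key_b64
EOF
done

echo "Manifests written to $workdir/manifests"
```

The generated files can then be applied with kubectl apply -f pointed at the manifests directory.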

Configuring Ingress Domain Names

Customize your ingress resources to match your domain names and assign them to the nginx-ingress-controller's IP address:

Using example.test.com as an example

Please see the example below:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: cicd-ingress
  namespace: cicd
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/backend-protocol: HTTP
    nginx.ingress.kubernetes.io/cors-allow-headers: X-Forwarded-For
    nginx.ingress.kubernetes.io/proxy-body-size: 50m
    nginx.ingress.kubernetes.io/use-regex: 'true'
...
spec:
  tls:
  - hosts:
    - example.test.com
    secretName: tlscomsecret
  rules:
  - host: example.test.com
    ...

Please see the TMS example below:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress
  namespace: processor
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/backend-protocol: HTTP
    nginx.ingress.kubernetes.io/cors-allow-headers: X-Forwarded-For
    nginx.ingress.kubernetes.io/proxy-body-size: 50m
    nginx.ingress.kubernetes.io/use-regex: 'true'
spec:
  tls:
    - hosts:
        - example.test.com
      secretName: tlscomsecret
  rules:
    - host: example.test.com
      http:
        paths:
          - path: /execute
            pathType: ImplementationSpecific
            backend:
              service:
                name: transaction-monitoring-service-rel-1-0-0
                port:
                  number: 3000
          - path: /
            pathType: ImplementationSpecific
            backend:
              service:
                name: transaction-monitoring-service-rel-1-0-0
                port:
                  number: 3000
          - path: /natsPublish
            pathType: ImplementationSpecific
            backend:
              service:
                name: nats-utilities-rel-1-0-0
                port:
                  number: 3000

Vault Configuration

After installing the Vault chart, you'll need to initialize and unseal Vault manually. The process involves generating unseal keys and a root token which you'll use to access and configure Vault further.

Vault is currently integrated into the system, but the Jenkins variables have not yet been added to Vault to be pulled through. This will be done at a later stage.

https://developer.hashicorp.com/vault/docs/concepts/seal

Vault_Operator_Init Sign_In_With_Token

Logstash Configuration

If LogLevel is set (to info, error, etc.) in your Jenkins environment variables, then you will need to configure this.

For comprehensive instructions on how to configure logging to Elasticsearch, please refer to the accompanying document. It provides a step-by-step guide that covers all the necessary procedures to ensure your logging system is properly set up, capturing and forwarding logs to Elasticsearch. This includes configuring log shippers, setting up Elasticsearch indices, and establishing the necessary security and access controls. By following this documentation, you can enable efficient log management and monitoring for your services.

Logging Data View

APM Configuration

If APMActive is set to true (default: true) in your Jenkins environment variables, then you will need to configure this.

Once configured, the APM tool will begin collecting data on application performance metrics, such as response times, error rates, and throughput, which are critical for identifying and resolving performance issues. The collected data is sent to the APM server, where it can be visualized and analyzed. For detailed steps on integrating and configuring APM with your Jenkins environment, please refer to the specific APM setup documentation provided in your APM tool's resources.

Setting Up Elastic APM

Jenkins Configuration

Accessing Jenkins UI

The following sections of the guide require you to work within the Jenkins UI. You can access the UI either through a domain, if you configured an ingress, or by port forwarding.

Port forward Jenkins to be accessible on localhost:8080 by running: kubectl --namespace cicd port-forward svc/jenkins 8080:8080

Get your 'admin' user password by running: kubectl exec --namespace cicd -it svc/jenkins -c jenkins -- /bin/cat /run/secrets/additional/chart-admin-password && echo

Adding Credentials in Jenkins

Credentials are critical for Jenkins to interact securely with other services such as source control management systems (like GitHub), container registries, and Kubernetes clusters. Jenkins provides a centralized credentials store where you can manage all these credentials. Here's a step-by-step guide:

GitHub Credentials

  1. Navigate to Manage Jenkins → Credentials → System → Global credentials (unrestricted).
  2. Click on Add Credentials.
  3. Select Username with password from the drop-down menu.
  4. Enter your GitHub username.
  5. Enter your GitHub password. If two-factor authentication is enabled, you'll need to use a personal access token in place of your password.
  6. Set the ID to something memorable, like Github.
  7. Optionally, provide a description like GitHub Credentials.
  8. Click Save.

Container Registry Credentials

  1. Follow the first two steps as above to navigate to the Add Credentials page.
  2. Select Username with password.
  3. Input the username for your container registry.
  4. Enter the corresponding password or access token for the registry. This is the token created during the registry setup.
  5. Assign a unique ID, such as ContainerRegistry.
  6. Include a description that helps identify the registry, like Login info for the container registry.
  7. Click Save.

GitHub Read Package Credentials

  1. Again, follow the initial steps to reach the Add Credentials page.
  2. Select Secret text.
  3. Enter the personal access token you've created on GitHub with the necessary scopes to read packages.
  4. Set the ID, for example, githubReadPackage.
  5. In the description, note the purpose, such as GitHub package read access.
  6. Click Save.

Kubernetes Credentials

To configure Jenkins to use Kubernetes secrets for authenticating with Kubernetes services or private registries, you can follow these steps, similar to setting up GitHub package read access:

  1. Retrieve the Kubernetes Token:
  2. Add Secret in Jenkins:
  3. Configure the Credential ID:
  4. Add a Description:
  5. Save the Configuration:

Following this process will allow Jenkins jobs to authenticate with Kubernetes using the token stored in the secret, enabling operations that require Kubernetes access or pulling images from private registries linked to your Kubernetes environment.

image-20240215-150928.png image-20240215-151159.png

Adding Managed Files for NPM Configuration in Jenkins

Navigate to Manage Jenkins → Managed files

https://docs.github.com/en/packages/learn-github-packages/about-permissions-for-github-packages#about-scopes-and-permissions-for-package-registries

https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token

image-20240212-124706.png

The image shows a Jenkins configuration screen for adding a managed file, specifically an NPM config file (npmrc). Here's a breakdown of the steps and fields:

  1. Managed Files: This section is for adding configuration files that Jenkins will manage and use across different jobs.
  2. ID: A unique identifier for the managed file.
  3. Name: In this case, it's npmrcRulesCredentials. This name helps users identify the file's purpose when selecting it for use in a job.
  4. Comment: An optional field where you can provide additional information about the managed file, such as its intended use. Here, it is described as 'user config'.
  5. Enforce NPM version 9 registry format: A checkbox that, when checked, enforces the file to be compatible with the NPM version 9 registry format.
  6. Add NPM Registry:
  7. Content: The text area labeled 'Content' is where you can input the actual content of the .npmrc file. This content typically includes configuration settings like the registry URL, authentication tokens, and various other npm options. The always-auth = false setting is not always required (it is usually for public registries).
  8. Add: After configuring all the fields, click "Add" to save this managed file configuration.

Once you've added this managed file, Jenkins can use it in various jobs that require npm to access private packages or specific registries. The managed file will be placed in the working directory of the job when it runs, ensuring that npm commands use the provided configuration.

Jenkins Node.js Configuration

Navigate to Manage Jenkins → Tools

image-20240215-061559.png

Jenkins Docker Installation Configuration

Navigate to Manage Jenkins → Tools

image-20240213-103453.png

Building Jenkins Agent Locally

This needs to be completed before adding the Jenkins Cloud agent.

Please follow the document below to build and push the image to the container registry.

Building the Jenkins Agent Image

Setting up a Jenkins cloud agent that will interact with your Kubernetes cluster

Navigate to Manage Jenkins → Clouds → Kubernetes settings

image-20240212-115316.png image-20240212-111931.png

image-20240212-112102.png

Add a Container: In this part of the configuration, you define the container that will run inside the pod created from the pod template.

NOTE: This needs to point to the Docker image built in this step: Building the Jenkins Agent Image

image-20240212-115159.png

Run in Privileged Mode: This is an advanced container setting that allows processes within the container to execute with elevated privileges, similar to the root user on a Linux system.

To select "Run in Privileged Mode" in Jenkins Kubernetes plugin:

  1. Within the container configuration, look for the "Advanced..." button or link (not visible in the screenshot) and click it to expand the advanced options.
  2. In the advanced settings, find the checkbox labeled "Run in privileged mode" and select it.

image-20240212-114225.png

Image Pull Secret

This needs to be set to frmpullsecret (see the screenshot below).

  1. Private Registry Authentication: If the container images used by your Jenkins jobs are hosted in a private registry, Kubernetes needs to authenticate with that registry. The image pull secret stores the required credentials (like a username and password or token).

  2. Adding Image Pull Secret to Pod Template:

    • Navigate to the Kubernetes cloud configuration within the Jenkins system settings.
    • Under the specific pod template that you are configuring, find the ImagePullSecrets section.
    • Enter the name of the Kubernetes secret that contains your private registry credentials in the Name field. This secret should already exist in the same namespace as your Jenkins builder pods. The name of the secret is frmpullsecret.
    • If you have multiple registries or need to pull from multiple private sources, you can add additional image pull secrets by clicking on the “Add Image Pull Secret” dropdown and entering the names of these secrets.
  3. YAML Merge Strategy: The YAML merge strategy determines how Jenkins should handle the YAML definitions from inherited pod templates. If set to 'Override', it means that the current YAML will completely replace any inherited YAML, which could be important if you need to ensure that the image pull secrets are applied without being altered by any inherited configurations.

By properly configuring image pull secrets in your Jenkins Kubernetes pod templates, you enable Jenkins to pull the necessary private images to run your builds within the Kubernetes cluster. Without these secrets, the image pull would fail, and your builds would not be able to run.
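If the frmpullsecret secret does not exist yet, one way to create it is with kubectl. This is a hedged sketch: the login server tazama.azurecr.io assumes the registry was named tazama as in the ACR example, the cicd namespace is assumed to be where the Jenkins builder pods run, and the username and password placeholders stand for the registry token created earlier.

```shell
# Assumptions: registry named "tazama" (login server <name>.azurecr.io),
# builder pods in the cicd namespace; replace the placeholders with your token.
kubectl create secret docker-registry frmpullsecret \
  --namespace cicd \
  --docker-server="tazama.azurecr.io" \
  --docker-username="<token-name>" \
  --docker-password="<token-password>"
```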

image-20240215-144955.png

Steps to Configure Jenkins Global Variables

  1. Accessing Global Configuration:
  1. Updating Environment Variables:

Passwords: These passwords can be found in your Kubernetes Cluster Secrets, which are autogenerated when the HELM installations are carried out.

Multiple ArangoDB passwords and endpoints: The reason we have different names and passwords for ArangoDB is to keep things organized and safe. Each name points to a different part of the database where different information is kept. Just like having different keys for different rooms. This is useful when you have more than one ArangoDB running at the same time and you want to keep them separate. This way, you can connect to just the part you need.

If you have a single database instance, you may be wondering why multiple password variants are needed. For example, if your Configuration, Pseudonyms and TransactionHistory databases are served from the same Arango instance, why must their password inputs include single quotes when the ArangoPassword variable does not need them?

The ArangoPassword variable is used as a CLI argument by newman when setting up the environment. Where it is called, the shell substitutes the ArangoPassword variable, and because that substitution can involve a special character such as $, the argument has to be surrounded by quotes: newman {omitted} "arangoPassword=${ArangoPassword}" --disable-unicode

The same reasoning applies to the passwords that are explicitly stated to need single quotes around them, as they are substituted as is into the processors' environments. If your password contains special characters, you must use single quotes so that it is interpreted as a raw string; otherwise the special characters are taken as an indication of substitution.
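The effect of quoting on a password that contains $ can be seen in a plain shell. This is an illustrative sketch only; the password p@ss$ecret is a made-up value and the snippet is not taken from the Tazama pipelines.

```shell
#!/usr/bin/env bash
# Hypothetical password containing a "$" character.
unset ecret                      # make sure no variable named "ecret" exists

double_quoted="p@ss$ecret"       # the shell expands $ecret (unset, so empty)
single_quoted='p@ss$ecret'       # single quotes keep every character as typed

echo "double quotes: ${double_quoted}"   # prints "double quotes: p@ss"
echo "single quotes: ${single_quoted}"   # prints "single quotes: p@ss$ecret"
```

With double quotes, part of the password is silently lost, which is why the affected variables need single quotes around the value.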

  1. Variables and Descriptions:

Adding Jenkins Jobs

Download the Job Configurations:

jobs.zip

Navigate to Configuration Directory:

cd <path to configuration>

eg: cd "C:\Documents\tasks\Jenkins\jobs"

image-20240220-055517.png

Copy Jobs to Jenkins Pod:

kubectl cp . <name of pod>:/var/jenkins_home/jobs/ -n cicd

Finalize the Setup

Reload Jenkins Configuration

kubectl rollout restart deployment <jenkins-deployment-name> -n cicd

Alternatively, trigger a safe restart from the browser, eg: http://localhost:52933/safeRestart

image-20240215-054140.png

Step 4: Running Jenkins Jobs to Install Processors

Overview

The process involves configuring Jenkins to deploy various processors into the Tazama cluster. These processors are essential components of the system and require specific configurations, such as database connections and service endpoints, to function correctly.

Populating ArangoDB:

Dashboard → Deployments→ ArangoDB

Run the Create Arango Setup job and then the Populate Arango Configuration job to populate ArangoDB with the configuration required by the system. These jobs use the variables set in the global configuration to connect to ArangoDB and perform the necessary setup.

image-20240215-052351.png

Edit Jobs: Configuring Credentials and Kubernetes Endpoints in Jenkins

After importing the Jenkins jobs, you need to configure each job with the appropriate credentials and Kubernetes server endpoint details. This setup is crucial to ensure that each job has the necessary permissions and access to interact with other services and the Kubernetes cluster.

Configuring Rule Processors:

  1. Access Each Rule Processor Job:

    • Navigate to the job configuration for each rule processor, such as TMS, Typology, etc. Click Configure to edit the job.
    • Within each job, look for the section where you can define or edit the repository from which the job will fetch the code or artifacts.
  2. Repository Configuration:

    • Set the Repository URL to the Git repository where the code for the processor is located. This is typically a URL like https://github.com//event-director/.

    • Since the Tazama services codebase lives in two GitHub organization accounts, you'll need to change $Repository for rule-processors to be tazama-lf. It's okay to hardcode this.

    • Under Credentials, select the appropriate credentials from the drop-down list, such as Github Creds, which should correspond to the credentials that have access to the repository.

  3. Kubernetes Configuration:

NOTE: The Kubernetes server endpoint can be copied from your kubeconfig file under cluster → server.

  1. Binding Credentials:

By completing these steps, you ensure that each Jenkins job can access the necessary repositories and services with the correct permissions and interact with your Kubernetes cluster using the right endpoints and credentials. It's essential to review and verify these settings regularly, especially after any changes to the credentials or infrastructure.
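The Kubernetes server endpoint mentioned in the note above can also be read with kubectl instead of opening the kubeconfig file by hand. This is a small sketch that assumes your current kubectl context points at the Tazama cluster.

```shell
# Show which context is active, to confirm it is the intended cluster.
kubectl config current-context

# Print the API server URL of the cluster used by the current context.
kubectl config view --minify --output 'jsonpath={.clusters[0].cluster.server}'
```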

image-20240213-064147.png image-20240213-064707.png image-20240213-064256.png

Deploying to the Cluster:

Dashboard → Deployments→ Pipelines→ Deploying All Rules and Rule Processors

Run the Jenkins jobs that deploy the processors to the Tazama cluster. These jobs will reference the global environment variables you've configured, ensuring that each processor has the required connections and configurations.

Run the Deploying All Rules and Rule Processors Pipeline Job

image-20240212-161403.png

End-to-End Platform Testing with the "E2E Test" Jenkins Job

Dashboard → Testing→ E2E Test

Overview of the "E2E Test" Job

The "E2E Test" job in Jenkins is an essential component for ensuring the integrity and robustness of the platform. It is specifically designed to perform comprehensive end-to-end testing, replicating user behaviors and interactions to verify that every facet of the platform functions together seamlessly.

Purpose and Benefits

Running the Test and Post-Test Evaluation

image-20240213-122427.png

Common Errors

image-20240213-144301.png

Arango ingress error

To resolve this issue, you would need to:

  1. Ensure that the tlscomsecret secret contains the necessary TLS certificates and keys.
  2. Add the tlscomsecret to the development namespace, if it's not already present.
  3. After the secret is correctly placed in the namespace, restart the affected pod by deleting the existing pod. Kubernetes will automatically spin up a new pod which should now successfully mount the required volumes, including the TLS secrets, and run as expected.

image-20240215-142735.png image-20240215-142727.png

Network Access Error in Container Deployment

To address the network access error encountered when deploying containers that require communication with arango.development.svc, follow these steps:

  1. Verify that the network policies and service discovery configurations are correctly set up within your cluster to allow connectivity to arango.development.svc.
  2. If your deployment is within a Kubernetes environment and you're using network namespaces, consider enabling the Host Network option. This grants the pod access to the host machine's network stack, which can be necessary if the service is only resolvable or accessible in the host's network:

Implementing these steps should help in resolving connectivity issues related to the arango.development.svc hostname not being found, facilitating successful POST requests to the specified endpoints.

image-20240215-143113.png

Addressing Pod Restart Issues in Kubernetes

If you are experiencing problems with your Kubernetes pods that may be related to environmental variables or configuration issues, such as frequent restarts or failed connections to services like ArangoDB, follow these steps to troubleshoot and resolve the issue:

  1. Check the Environment Variables in Jenkins:

    • Ensure that all required environment variables are properly set in Jenkins. These variables might include database connection strings, service endpoints, credentials, or other configuration parameters necessary for your application to run correctly.
    • Review the build and deployment scripts in Jenkins to confirm that the environment variables are being injected into the deployment manifests or pod configurations.
  2. Verify ArangoDB Configuration:

    • Double-check the ArangoDB configuration to ensure that it is correct and aligns with the requirements of your application. This may include database URLs, user credentials, database names, and any other related configuration details.
    • If you are using Kubernetes ConfigMaps or Secrets to manage the ArangoDB configuration, make sure they are correctly defined and mounted into your pods.
  3. Monitor Pod Status and Logs:

    • Observe the status of the pods through the Kubernetes dashboard or using kubectl get pods command. Take note of any pods that are in a CrashLoopBackOff state or that are frequently restarting.
    • Use kubectl describe pod <pod-name> to get more details about the pod's state and events that might indicate what is causing the restarts.
    • Examine the logs of the restarting pods using kubectl logs <pod-name> to look for any error messages or stack traces that could point to a configuration problem or a missing environment variable.
  4. Address Possible Configuration Drifts:

    • In a dynamic environment like Kubernetes, configuration drifts can occur where the running state of the system deviates from the defined state. Ensure that all deployments, StatefulSets, or other controller resources match the intended configuration.
  5. Update and Restart Pods if Necessary:

    • Once any necessary changes have been made to the environment variables or ArangoDB configuration, update the relevant Kubernetes resources.
    • You can restart the affected pods to apply the changes by deleting them and letting the ReplicaSet create new ones with the correct configuration.

By carefully checking your Jenkins environment variables and ensuring the ArangoDB configuration is correct, you can resolve issues leading to pod instability and ensure that your services run smoothly in the Kubernetes environment.
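The inspection and restart steps above can be sketched as the following commands. The processor namespace is an assumption; substitute the namespace of the failing pod and replace the pod-name placeholder with a name from the first command's output.

```shell
# Assumption: the failing pods run in the processor namespace.
NAMESPACE="processor"
POD="<pod-name>"

# List pods and look for CrashLoopBackOff or a high restart count.
kubectl get pods --namespace "$NAMESPACE"

# Show the pod's state and recent events.
kubectl describe pod "$POD" --namespace "$NAMESPACE"

# Show logs from the previous (crashed) container instance.
kubectl logs "$POD" --namespace "$NAMESPACE" --previous

# After fixing the configuration, delete the pod so its controller recreates it.
kubectl delete pod "$POD" --namespace "$NAMESPACE"
```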

image-20240215-143411.png image-20240215-143443.png image-20240215-143505.png

Addressing Jenkins Build Authentication Errors

When encountering authentication errors during a Jenkins build process that involve Kubernetes plugin issues or Docker image push failures, follow these troubleshooting steps:

  1. Kubernetes Plugin Error:

    • The error message suggests a NullPointerException, which is often due to missing or improperly configured credentials within Jenkins. This could be an issue with the Kubernetes plugin configuration where a required value is not being set, resulting in a null being passed where an object is expected.
    • Review your Jenkins job configurations and ensure that all the Kubernetes-related credentials are correctly set and that the plugin is properly configured.
    • If you are using credential substitution (injected variables), ensure that the substitutions are correctly configured. If necessary, as per the provided instruction, deselect all credential substitutions to see if this resolves the error. This can help isolate the issue by reverting to default or hardcoded credentials, which can then be individually reinstated to identify the problematic substitution.
  2. Docker Image Push Error:

    • The failure to push a Docker image to a registry, with an error indicating "unable to retrieve auth token," typically points to incorrect credentials being used for Docker registry authentication.
    • Confirm that the Docker registry credentials set in Jenkins are accurate. You may need to update the username and password or use an access token if the registry requires it.
    • Ensure that the credentials are correctly mapped in the Jenkins job and that any credential substitution is correctly applied.
  3. Rerun the Jenkins Jobs:

    • After making the necessary corrections, rerun the Jenkins jobs to confirm that the issue is resolved.
    • Monitor the build output for any further authentication-related errors and address them as needed.
  4. Additional Steps:

    • If the issue persists, consider regenerating or re-obtaining the necessary credentials and updating them in Jenkins.
    • Check the Jenkins system logs and the specific job's console output for more detailed error messages that can provide additional context for the failure.
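
One way to confirm the registry credentials outside Jenkins is to attempt a login from a workstation with the same values. This is a minimal sketch, not part of the Tazama pipeline: the helper name `check_registry_login`, the `DOCKER_BIN` override, and the example values are assumptions. For an Azure container registry named tazama, the login server would be `tazama.azurecr.io`.

```shell
# Hypothetical helper: try a registry login with the credentials that
# Jenkins is configured to use. The secret is passed on stdin so it does
# not appear in the process list or shell history.
check_registry_login() {
  registry=$1   # registry login server, for example tazama.azurecr.io
  user=$2       # username or token name
  secret=$3     # password or token value

  printf '%s' "$secret" |
    "${DOCKER_BIN:-docker}" login "$registry" --username "$user" --password-stdin
}

# Example call (placeholder values):
#   check_registry_login tazama.azurecr.io my-token-name "$ACR_TOKEN_PASSWORD"
```

If the login fails here as well, the credentials themselves are likely wrong. If it succeeds, the mapping of the credentials inside the Jenkins job is the more probable cause.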

By following these steps, you can address the authentication issues that are causing the Jenkins build process to fail, ensuring a successful connection to Kubernetes and Docker registry services.

Jenkins Build Agent terminating and restarting

If the Jenkins agent starts up on your Kubernetes instance and then terminates and restarts, you might need to change the .dockerconfigjson data of the frmpullsecret secret in the cicd namespace.

Docker Config JSON: Understanding the auth Field

The auth field in the .dockerconfigjson file is a base64-encoded string that combines your Docker registry username and password in the format username:password. Here's how you can construct it:

Steps to Construct the auth Field

  1. Combine the Username and Password

    Format the string as username:password. For example, if your username is frms and your password is yourpassword, the string is frms:yourpassword.

  2. Base64 Encode the String

You can use a command-line tool like base64 or an online base64 encoder to encode the string.

Using a command-line tool:

echo -n 'frms:yourpassword' | base64

This will produce a base64-encoded string, which you then place in the auth field.
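
As a quick check, the encoded value can be decoded again to confirm that it contains exactly username:password with no trailing newline. The sketch below reuses the example credentials from the steps above; the variable names are illustrative.

```shell
# Example credentials from the steps above (placeholders, not real values).
username='frms'
password='yourpassword'

# printf writes the string without a trailing newline, which would
# otherwise end up inside the encoded value.
auth=$(printf '%s:%s' "$username" "$password" | base64)
echo "$auth"      # prints ZnJtczp5b3VycGFzc3dvcmQ=

# Decode the value to confirm the round trip.
decoded=$(printf '%s' "$auth" | base64 --decode)
echo "$decoded"   # prints frms:yourpassword
```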

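Here is an example of what the .dockerconfigjson content might look like once the auth field has been encoded. The key under auths (registry here) is a placeholder for your registry's login server, and the auth value is the base64 encoding of frms:yourpassword:

{"auths":{"registry":{"username":"frms","password":"yourpassword","email":"no@email.local","auth":"ZnJtczp5b3VycGFzc3dvcmQ="}}}

Please see the example Secret below. Because the value sits under data, the .dockerconfigjson entry must be the whole JSON document, base64-encoded: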

apiVersion: v1
kind: Secret
metadata:
  name: frmpullsecret
  namespace: cicd
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: >-
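
The value for the .dockerconfigjson key can be produced by building the JSON document and base64-encoding it as a whole. This is a minimal sketch with the example credentials used earlier. The registry name `registry` is a placeholder for your registry's login server, the variable names are illustrative, and the simple printf template assumes that none of the values contain characters that need JSON escaping.

```shell
# Placeholder values; replace them with your own registry and credentials.
registry='registry'
username='frms'
password='yourpassword'
email='no@email.local'

# The auth field is base64 of username:password.
auth=$(printf '%s:%s' "$username" "$password" | base64 | tr -d '\n')

# Assemble the Docker config JSON document.
config=$(printf '{"auths":{"%s":{"username":"%s","password":"%s","email":"%s","auth":"%s"}}}' \
  "$registry" "$username" "$password" "$email" "$auth")

# Encode the whole document; tr removes the line wrapping that some
# base64 implementations add to long output.
encoded=$(printf '%s' "$config" | base64 | tr -d '\n')
echo "$encoded"
```

The printed string is what goes after `.dockerconfigjson:` in the Secret. `kubectl create secret docker-registry` can generate an equivalent Secret directly from the server, username, and password if you prefer not to encode it by hand.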

Conclusion: Finalizing Tazama System Installation

With the Helm charts and Jenkins jobs successfully executed, your Tazama (Real-time Monitoring System) should now be operational within your Kubernetes cluster. This comprehensive setup leverages the robust capabilities of Kubernetes orchestrated by Jenkins automation to ensure a seamless deployment process.

As you navigate through the use and potential customization of the Tazama system, keep in mind the importance of maintaining the configurations as documented in this guide. Regularly update your environment variables, manage your credentials securely, and ensure that the pipeline scripts are kept up-to-date with any changes in your infrastructure or workflows.

Should you encounter any issues or have questions regarding the installation and configuration of the Tazama system, support is readily available. You can reach out via email or join the dedicated Slack workspace for collaborative troubleshooting and community support.

For direct assistance:

Joining the Tazama CoE workspace on Slack will connect you with a community of experts and peers who can offer insights and help you leverage the full potential of your Tazama system. Always ensure that you are working within secure communication channels and handling sensitive information with care.