Integration for LitmusChaos
Apache License 2.0

Litmus Service

This service lets Keptn trigger chaos tests against your applications using the LitmusChaos framework. Learn more about this integration in our 2-part blog series: part 1, part 2.

Compatibility Matrix

Keptn Version    litmus-service Docker Image
0.7.1            keptnsandbox/litmus-service:0.1.0
0.7.2            keptnsandbox/litmus-service:0.1.0
0.7.3            keptnsandbox/litmus-service:0.1.1
0.8.0-0.8.3      keptnsandbox/litmus-service:0.2.0
0.8.4-0.8.5      keptnsandbox/litmus-service:0.2.1
0.19.0           keptnsandbox/litmus-service:0.2.2

Prerequisites

The Keptn litmus-service requires the following prerequisites to be set up on the Kubernetes cluster before it can run the chaos tests:

Execute the following commands to set up these dependencies for a demo setup:

kubectl apply -f ./test-data/litmus/litmus-operator-v2.13.0.yaml
kubectl apply -f ./test-data/litmus/pod-delete-ChaosExperiment-CR.yaml 
kubectl apply -f ./test-data/litmus/pod-delete-rbac.yaml 

Keptn CloudEvents

This service reacts to the following Keptn CloudEvents (see deploy/service.yaml):

Notes:

Installation - Deploy in your Kubernetes cluster

To deploy the current version of the litmus-service in your Keptn Kubernetes cluster, clone the repo and apply the deploy/service.yaml file:

kubectl apply -f deploy/service.yaml

This will install the litmus-service into the keptn namespace, which you can verify using:

kubectl -n keptn get deployment litmus-service -o wide
kubectl -n keptn get pods -l run=litmus-service

Usage

To make use of the Litmus service, a dedicated experiment.yaml file with the actual chaos experiment has to be added to Keptn (for the service under test).

You can do this via the Keptn CLI. Replace the values for project, stage, service, and resource with your actual values, but note that the resourceUri must be set to litmus/experiment.yaml.

keptn add-resource --project=litmus --stage=chaos --service=carts --resource=litmus/experiment.yaml --resourceUri=litmus/experiment.yaml 

We recommend running the chaos experiment together with a load test. When a send-test event is sent to Keptn, the chaos test is triggered along with the load tests. Once the load tests finish, Keptn performs the evaluation and provides a result. You can then verify whether your application is resilient, that is, whether your SLOs are still met while chaos is injected.

How does the service work?

The service implements handlers that trigger the chaos tests in the "testing phase" of Keptn, which means that Keptn triggers the chaos tests right after deployment. The test is executed by a set of chaos pods (notably the chaos-runner and the experiment pod), and the test results are stored in a ChaosResult custom resource. The duration of the test and other tunables can be configured in the ChaosEngine resource; refer to the Litmus docs for the supported tunables. Litmus ensures that the review app/deployment is restored to its initial state upon completion of the test.

The Keptn litmus-service also conditionally generates & handles the test.finished event by cleaning up residual chaos resources (running or completed) in the cluster.
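The flow described above can be pictured as a small dispatch on the incoming event type. The following Go sketch is illustrative only: the event type strings assume the Keptn 0.8.x naming scheme, and `actionFor` and its action names are hypothetical and not taken from eventhandlers.go.

```go
package main

import "fmt"

// Assumed event type names (Keptn 0.8.x naming scheme). The types this
// service actually subscribes to are defined in deploy/service.yaml.
const (
	testTriggered = "sh.keptn.event.test.triggered"
	testFinished  = "sh.keptn.event.test.finished"
)

// actionFor is a hypothetical helper that maps an incoming event type to
// the step the service would take for it.
func actionFor(eventType string) string {
	switch eventType {
	case testTriggered:
		// Start the chaos test defined in litmus/experiment.yaml.
		return "start-chaos"
	case testFinished:
		// Remove residual chaos resources (running or completed).
		return "cleanup"
	default:
		// Any other event is not relevant to this sketch.
		return "ignore"
	}
}

func main() {
	events := []string{testTriggered, testFinished, "sh.keptn.event.deployment.finished"}
	for _, e := range events {
		fmt.Printf("%s -> %s\n", e, actionFor(e))
	}
}
```

Keeping the mapping in one pure function makes it easy to unit test without a cluster; the real handlers additionally need access to the Kubernetes API to create and delete the chaos resources.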

It is standard practice to execute the chaos tests in parallel with other performance/load tests running on the application under test (AUT). The subsequent quality gate evaluations in such cases are more reflective of real-world outcomes.

Note: The sample project provided in this repo (in the test-data folder) uses a JMeter load test against the AUT, carts, running in parallel with the pod-delete chaos test.

Uninstall - Delete from your Kubernetes cluster

To remove the litmus-service, delete the resources defined in the deploy/service.yaml file:

kubectl delete -f deploy/service.yaml

Upgrading or Downgrading

Adapt and use the following command in case you want to upgrade or downgrade your installed version (specified by the $VERSION placeholder):

kubectl -n keptn set image deployment/litmus-service litmus-service=keptnsandbox/litmus-service:$VERSION --record

Configuring the Service

Development

Development can be conducted using any Go-compatible IDE/editor (e.g., JetBrains GoLand, VS Code with Go plugins).

It is recommended to make use of branches as follows:

When writing code, it is recommended to follow the coding style suggested by the Go community.

Where to start

If you don't care about the details, your first entry point is eventhandlers.go. Within this file you can add implementations for predefined Keptn CloudEvents.

To better understand Keptn CloudEvents, please look at the Keptn Spec.

If you want more insight, look into main.go and deploy/service.yaml, and consult the Keptn docs as well as existing Keptn Core and Keptn Contrib services.

Common tasks

Testing Cloud Events

We have dummy CloudEvents in the form of RFC 2616 requests in the test-events/ directory. These can be easily executed using third-party plugins such as the Huachao Mao REST Client in VS Code.
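If you prefer to send a test event from code instead of an editor plugin, a request of the same shape can be built with Go's standard library. The sketch below is a minimal example under stated assumptions: `newCloudEventRequest` is a hypothetical helper, the event type and the id/source values are placeholders, and a stub server stands in for a locally running litmus-service. It posts a CloudEvent in structured JSON mode.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newCloudEventRequest is a hypothetical helper that builds an HTTP POST
// carrying a CloudEvent in structured JSON mode. The id and source values
// are placeholders, not values used by Keptn.
func newCloudEventRequest(url, eventType string, data map[string]string) (*http.Request, error) {
	event := map[string]interface{}{
		"specversion": "1.0",
		"id":          "test-event-1",
		"source":      "local-test",
		"type":        eventType,
		"data":        data,
	}
	body, err := json.Marshal(event)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/cloudevents+json")
	return req, nil
}

func main() {
	// Stub server standing in for a locally running litmus-service.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	// Assumed event type and payload fields, for illustration only.
	req, err := newCloudEventRequest(srv.URL, "sh.keptn.event.test.triggered",
		map[string]string{"project": "litmus", "stage": "chaos", "service": "carts"})
	if err != nil {
		panic(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.StatusCode)
}
```

To target a real instance, replace the stub server URL with the address where the service listens; the payload fields it expects are best copied from the files in test-events/.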

Automation

GitHub Actions: Automated Pull Request Review

This repo uses reviewdog for automated reviews of Pull Requests.

You can find the details in .github/workflows/reviewdog.yml.

GitHub Actions: Unit Tests

This repo has automated unit tests for pull requests.

You can find the details in .github/workflows/CI.yml.

GitHub Actions: Build Docker Images

This repo uses GitHub Actions to automatically build Docker images.

The following secrets need to be added to your repository secrets:

Furthermore, the variable IMAGE needs to be configured properly in .ci_env:

IMAGE=keptnsandbox/litmus-service 

How to release a new version of this service

It is assumed that the current development takes place in the master branch (either via Pull Requests or directly).

To make use of the built-in automation using Travis CI for releasing a new version of this service, you should

If any problems occur, fix them in the release branch and test them again.

Once you have confirmed that everything works and your version is ready to go, you should

License

Please find more information in the LICENSE file.