opdev / l5-operator-demo

Showcase level 5 operator capabilities
Apache License 2.0

L5 Demo Operator 🏗️

The project's goal is to develop a demo operator with level 5 capabilities. It serves as an example for an internal workshop and as the basis for a presentation at KubeCon or a similar conference, for which a proposal was submitted. One version of the presentation was accepted at Cloud Native Rejects. The capabilities are being developed according to our interpretation of the operator capability descriptions given in the Operator SDK, which are themselves still a work in progress.

Current Capability Descriptions

The L5 operator is an example of a minimal implementation of the five capability levels described by the Operator Framework.

Level 1: Basic Install

The L5 operator takes advantage of the Operator Lifecycle Manager and can be deployed with one click. Once deployed, it sets up its operand when a custom resource (CR) is created. Users install the operand by creating and configuring the CR. When the configuration of the CR changes, the operand updates in a non-disruptive fashion and the app version is reported in the status.

Level 2: Seamless Upgrades

It is possible to roll out both operator and operand updates seamlessly, with some caveats. Liveness and readiness probes come for free. The operator updates the operand via a field in the CR, which lets us control the app version and report the status.
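The version-driven update described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the names `Plan` and `planUpgrade` are hypothetical and are not taken from the operator's source, and a real reconciler would compare the CR against the live Deployment through the Kubernetes API.

```go
package main

import "fmt"

// Plan is a hypothetical result of comparing the CR with the running operand.
type Plan struct {
	Update          bool   // whether the operand needs a rollout
	TargetVersion   string // version the operand should end up on
	ReportedVersion string // version to report in the CR status for now
}

// planUpgrade decides whether a rollout is needed. The status keeps
// reporting the version that is actually running until the rollout is done.
func planUpgrade(specVersion, runningVersion string) Plan {
	if specVersion == "" || specVersion == runningVersion {
		return Plan{Update: false, TargetVersion: runningVersion, ReportedVersion: runningVersion}
	}
	return Plan{Update: true, TargetVersion: specVersion, ReportedVersion: runningVersion}
}

func main() {
	fmt.Printf("%+v\n", planUpgrade("1.1", "1.0")) // version bumped in the CR
	fmt.Printf("%+v\n", planUpgrade("1.0", "1.0")) // nothing to do
}
```

Keeping the decision in a pure function like this makes it easy to unit test separately from the cluster calls.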

Level 3: Full Lifecycle

Backup and restore functionality is provided by the Crunchy Postgres operator, which we consume to get a database-as-a-service right within our cluster. The L5 operator has liveness and readiness probes, so if the connection to the database fails, it waits until the reconfiguration work has finished. The L5 operator uses the rolling deployment strategy.

Level 4: Deep Insights

The operator and the operand both expose metrics, which are aggregated with Prometheus and visualized with Grafana. The exposed metrics include application latency, requests per second, and HTTP status codes. We are currently working on implementing the RED method, which defines three key metrics for any service: rate (the number of requests per second), errors (the number of those requests that fail), and duration (the amount of time those requests take).

Level 5: Autopilot

The operator autoscales the operand by automatically provisioning a horizontal pod autoscaler, which changes the size of the deployment based on application load. Pods are scaled within the range set by the size and maxReplicas values we declare.
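The scaling behaviour can be sketched with the basic rule documented for the Kubernetes horizontal pod autoscaler, desired = ceil(current * currentMetric / targetMetric), clamped to a declared range. This is a simplified illustration: the function name is hypothetical, the real autoscaler also applies tolerances and stabilization windows, and treating size as the lower bound is an assumption here.

```go
package main

import (
	"fmt"
	"math"
)

// DesiredReplicas applies the basic horizontal pod autoscaler rule and
// clamps the result to the range [minReplicas, maxReplicas].
func DesiredReplicas(current int32, currentMetric, targetMetric float64, minReplicas, maxReplicas int32) int32 {
	desired := current
	if targetMetric > 0 {
		desired = int32(math.Ceil(float64(current) * currentMetric / targetMetric))
	}
	if desired < minReplicas {
		return minReplicas
	}
	if desired > maxReplicas {
		return maxReplicas
	}
	return desired
}

func main() {
	// 2 pods at 90% CPU against a 50% target would want 4, capped at maxReplicas 3.
	fmt.Println(DesiredReplicas(2, 90, 50, 1, 3)) // 3
	// The same load with more headroom scales to 4.
	fmt.Println(DesiredReplicas(2, 90, 50, 1, 10)) // 4
	// 4 pods at 10% CPU would want 1, but the lower bound of 2 holds.
	fmt.Println(DesiredReplicas(4, 10, 50, 2, 10)) // 2
}
```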

Operator Installation

The operator is published as a community operator on the OpenShift OperatorHub.

Different ways to run the Operator using the Operator SDK

Prerequisites

1. Run locally outside the cluster

git clone https://github.com/opdev/l5-operator-demo
cd l5-operator-demo
make generate
make manifests
make install
make run
oc apply -f config/samples/pets_v1_bestie.yaml

2. Run as a Deployment inside the cluster

git clone https://github.com/opdev/l5-operator-demo
cd l5-operator-demo
make generate
make manifests

3. Deploy the Operator with OLM

operator-sdk run bundle <operator-bundle-image>

High Level Diagrams

An editable version of this diagram is on Google Drive (internal).

Deployment Diagram

"Traditional" Architecture

Traditional Deployment

What it looks like in Kubernetes

Deployment Diagram