operator-framework / enhancements

OLM V1 Milestone 1 #120

Closed awgreene closed 1 year ago

awgreene commented 1 year ago

Milestone 1

User Story

As a cluster admin, I can install and uninstall an extension through the Operator API from a hardcoded list of extensions available on the cluster.

Description

Milestone 1 is all about getting the foundational pieces of OLMv1 in place and functional, but with minimal features. So the focus for milestone 1 will be the installation and uninstallation of two operators, covered by the following scenarios:

Scenario 1

Install the cockroachdb and prometheus operators from a hardcoded list of entities from the operatorhub catalog by creating Operator objects for them.

Scenario 2

Create an Operator object for package name nonexistent and show that the Operator status contains the deppy error message about that package not existing in the catalog.
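
For illustration, one way this scenario could be exercised once the pieces land; the object name nonexistent-sample is made up for this sketch, and the API group matches the sample shown later in this thread:

kubectl apply -f - <<EOF
apiVersion: operators.operatorframework.io/v1alpha1
kind: Operator
metadata:
  name: nonexistent-sample
spec:
  packageName: nonexistent
EOF

# the status should surface deppy's resolution error for the missing package
kubectl get operators.operators.operatorframework.io nonexistent-sample -o yaml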

Scenario 3

Create an Operator object for a package that rukpak does not support (e.g. has a webhook definition or does not support AllNamespaces install mode) and show that the Operator status contains the rukpak error message saying why the bundle installation failed.

Scenario 4

Delete the cockroachdb and prometheus operators from the cluster. Show that the underlying bundle contents are fully cleaned up.

(Bonus) Scenario 5

Use kubectl operator install and kubectl operator uninstall to create and delete the Operator objects described in scenarios 1-4.

joelanford commented 1 year ago

I updated the milestone description to add Scenario 5 as a nice-to-have that can be worked in parallel with other efforts involved in this milestone.

bparees commented 1 year ago

scenario 1 indicates it's installing from the "operatorhub catalog" but nothing in this milestone seems to discuss how catalogs are installed/registered/extracted? should that be another explicit scenario?

varshaprasad96 commented 1 year ago

The following were achieved in Milestone 1 of OLM V1.

The following steps have been tested and the demo has been recorded with this commit of the Operator Controller repository.

The pre-requisites for running this project are:

  1. Access to a Kubernetes cluster.
  2. Docker version - preferably 17.03+.
  3. A compatible version of kubectl installed to interact with the cluster.
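
These prerequisites can be sanity-checked with standard commands (output will vary by environment):

# verify tool versions and cluster access
kubectl version --client
docker --version
kubectl cluster-info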

In addition to the above, we also need:

RukPak: for installing and managing OLM bundles and other artifacts on the cluster.
Cert-manager: for creating and managing the certificates needed for Rukpak's webhooks.

Fortunately, we have a Makefile target that helps us install a compatible version of Rukpak and Cert-manager.

Installing the Prometheus Operator:

Step 1:

Build and Deploy the Operator Controller image:

Run the make run command that:

  1. Builds and pushes an image for the operator controller.
  2. Installs a compatible version of cert-manager and Rukpak.
  3. Installs the required CRDs and manifests to run the Operator.

Note: Feel free to change the base image repository of the Operator Controller to one that is accessible to you. This can be done by modifying this in the Makefile; it is the image name of the operator controller and the remote registry to which it is pushed. To learn more about configuring the operator's image registry, refer here.
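
As a rough sketch, the invocation might look like the following; the image reference is a placeholder, and the exact Makefile variable name should be verified against the repository's Makefile:

# build, push, and deploy with a custom image (variable name assumed; check the Makefile)
make run IMG=quay.io/example/operator-controller:dev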

Step 2:

Apply the Custom Resource.

The operator to be installed is specified in the packageName field of the CR's spec. Here we will install the prometheus-operator.

apiVersion: operators.operatorframework.io/v1alpha1
kind: Operator
metadata:
  labels:
    app.kubernetes.io/name: operator
    app.kubernetes.io/instance: operator-sample
    app.kubernetes.io/part-of: operator-controller
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: operator-controller
  name: operator-sample
spec:
  # TODO(user): Add fields here
  packageName: prometheus

kubectl apply -f config/samples/operator_v1alpha1_operator.yaml

Step 3:

Verify that the Prometheus Operator has been installed.

Check that the prometheus operator is available and running successfully:

➜  operator-controller git:(demo) k get pods -n prometheus-system
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-65b7cf6fb4-chj4q   1/1     Running   0          33m

Check that the prometheus APIs are available for use:

➜  operator-controller git:(demo) k api-resources| grep -i prometheus
prometheuses                                   monitoring.coreos.com/v1                  true         Prometheus
prometheusrules                                monitoring.coreos.com/v1                  true         PrometheusRule

The Prometheus operator and its APIs were installed through OLM's Operator Controller and are available on the cluster for use.
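
The Operator resource itself can also be inspected; its status reflects what the controller reported for the installation (the exact condition fields are still evolving at this milestone):

kubectl get operators.operators.operatorframework.io operator-sample -o yaml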

Since the prometheus-operator is installed by the Operator Controller, deleting the prometheus-operator pod will trigger a reconcile and it will be installed again.
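
For example, using the namespace and pod name from the output above:

# delete the running operator pod; a replacement pod should appear shortly
kubectl delete pod -n prometheus-system prometheus-operator-65b7cf6fb4-chj4q
kubectl get pods -n prometheus-system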

Step 4:

Uninstall the operator by deleting the Custom Resource:

In order to uninstall the Prometheus operator, delete the CR:

➜  operator-controller git:(demo)  kubectl delete operators.operators.operatorframework.io operator-sample
operator.operators.operatorframework.io "operator-sample" deleted

Verify that the Prometheus Operator has been uninstalled:

All prometheus-operator pods should be terminated; the following should not show any pods:

➜  operator-controller git:(demo)  k get pods -A | grep -i prometheus
➜  operator-controller git:(demo) 

The prometheus APIs should no longer be available on cluster:

➜  operator-controller git:(demo)  k api-resources| grep -i prometheus
➜  operator-controller git:(demo) 

What does Operator Controller do currently?

  1. Watches the Operator resources and triggers a reconcile if there are any changes.
  2. Based on the entity set provided to Deppy, we get a resolution set that contains the entities that need to be installed.
  3. Based on the solution set provided by Deppy (more work on this will be done in Milestone 2), utilizes Rukpak APIs to install bundles on the cluster.
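
Since bundle installation goes through Rukpak, the generated bundle deployment can also be inspected directly (assuming Rukpak's standard core.rukpak.io API group):

# list the bundle deployments created on behalf of Operator objects
kubectl get bundledeployments.core.rukpak.io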

Demo:

asciicast

Thanks to all the contributors for helping us achieve Milestone 1! 🎉

joelanford commented 1 year ago

I didn't see anything related to scenario 2 or 3 in the demo. Does the code satisfy those as well? Or do we have a little more work to do?

bparees commented 1 year ago

i'm missing how the system goes from "packageName: prometheus" to finding a bundle to install. What's the source of available packages?

joelanford commented 1 year ago

Right now it's hardcoded:

varshaprasad96 commented 1 year ago

> I didn't see anything related to scenario 2 or 3 in the demo. Does the code satisfy those as well? Or do we have a little more work to do?

Scenario 2 is satisfied by the controller for now; will add some test cases to the demo. For Scenario 3, we need to resurface the error from the BundleDeployment's status.

> i'm missing how the system goes from "packageName: prometheus" to finding a bundle to install. What's the source of available packages?

Yeah, as Joe mentioned, it's hardcoded for now. But we do have a working PoC of the same that works well with catalog sources. The next step is to merge that in.