projectsyn / lieutenant-operator

The Project Syn Inventory API Operator
https://docs.syn.tools/lieutenant-operator/
BSD 3-Clause "New" or "Revised" License
3 stars 1 forks source link

Rethink Reconcile Handling #80

Closed Kidswiss closed 3 years ago

Kidswiss commented 4 years ago

At first the reconcile functions were rather simple and the functionality was added directly there.

But we're now at a point where this doesn't scale very well. So I'd like to introduce some more structure to the reconcile functions. We have to to things in certain orders, as they could block or affect other steps during the reconcile. All that logic currently resides inside the respective reconcile loops. This makes it very hard to introduce new functionality that has to be done in certain orders. Also there's a lot of repetition as each reconcile has to do various steps that are common for all the reconcile loops (adding certain labels, or writing back the manipulated CR to the API, etc.).

With some inspiration from https://crossplane.io/docs/master/contributing/services_developer_guide.html I'd like to propose some changes:

We'll define an interface(s) that exposes functions for:

These functions are roughly in two categories: determine the state and apply the state and may be split from each other.

Then we'd need some controller, that will go through these functions and determine what actions have to be taken. This sounds like something that could easily be modelled by a FSM (finite state machine). Where it will transition through the various possible states until it reaches the final state where everything is in sync.

This has various benefits:

Kidswiss commented 4 years ago

After a detailed call with @corvus-ch my idea to use a full blown FSM may be overkill. His suggestions:

We can use a weighted list, where all the various, small steps are added. Each of these steps has a single responsibility, like addLabel, CreateGit, RemoveSecrets, etc.

In this new design, the reconcile function only fetches the necessary information from the cluster to create a state. The list of steps will be filtered according to the state, to get a smaller list of steps that need to be executed. This makes it very easy to extend the reconcile logic, as well as adding the steps at the right point.

Kidswiss commented 4 years ago

Interesting article by redhat, on best practices for reconciling: https://www.openshift.com/blog/kubernetes-operators-best-practices

Kidswiss commented 4 years ago

I created a POC implementing the idea discussed: https://github.com/Kidswiss/execution-engine

With that it's easy to add more functionality to a reconcile flow, especially for logic, that should be added to all the CRDs.

srueg commented 4 years ago

As discussed, we should probably make sure each reconcile loop only updates it's own CRs: The cluster_reconcile should only update Cluster objects and not tenants for example. The tenant reconcile should find a list of clusters for each tenant and update the tenant object accordingly.

This should solve the conflicts we are experiencing currently.

Kidswiss commented 4 years ago

From the last call about this:

Kidswiss commented 4 years ago

The refactoring is completed.

The various CRDs now fetch the necessary information now instead of being pushed. That got rid of all the race conditions we observed before. One issue though: the cluster files in the tenant repository will only be applied on the next tenant reconcile.

If the cluster controller sets the right owner by itself, it should be possible to accelerate that. Then the cluster will be observable as a secondary resource.

srueg commented 4 years ago

We have an issue to implement according ownerReferences: #45