DoodleScheduling / keycloak-controller

Keycloak realm reconciliation for kubernetes
Apache License 2.0

fix: runtime and test race conditions #243

Closed raffis closed 2 months ago

raffis commented 2 months ago

Current situation

Unit tests fail at random because the controller code is not robust against stale reads. After a reconcile, the controller stores its state in .Status. At the same time, another reconcile loop starts with a stale resource from the client cache, because it was triggered by the pod delete event, which was issued before the status update.

Proposal

Refactor the code to be more robust in general. That means splitting the reconciliation into distinct phases, each of which exits the reconciliation and restarts it from the beginning if required. The phases are:

  1. Unlink stale reconciler from the realm status
  2. Delete reconciler pod if it is stale
  3. Garbage collect reconciler pod
  4. Skip reconciliation if the last successful reconciliation plus spec.interval is still later than now()
  5. Handle container status of a reconciler
  6. Create a new reconciler pod
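
To illustrate the idea, here is a minimal, self-contained sketch of such a phased reconcile in Go. It is not the controller's actual code: `Realm`, `Pod`, `Cluster`, the phase names and the `demo` helper are hypothetical stand-ins, and the in-memory map replaces the Kubernetes API. It also assumes that "stale" means a status reference to a missing pod (phase 1) or a pod created for an older generation (phase 2). Every phase that acts returns `true`, so the reconcile exits and the next pass starts again from phase 1 with fresh state.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical stand-ins for the Kubernetes objects; not the controller's real types.
type Pod struct {
	Generation int64  // realm generation the pod was created for
	Phase      string // "Running", "Succeeded" or "Failed"
}

type Realm struct {
	Generation  int64
	Interval    time.Duration
	LastSuccess time.Time
	Reconciler  string // pod name recorded in .Status, "" if none
}

type Cluster map[string]*Pod // in-memory stand-in for the API server

// Each phase returns true if it acted; the caller then exits and starts over.
var phases = []struct {
	name string
	run  func(r *Realm, c Cluster, now time.Time) bool
}{
	{"unlink-stale", func(r *Realm, c Cluster, _ time.Time) bool {
		if _, ok := c[r.Reconciler]; r.Reconciler == "" || ok {
			return false
		}
		r.Reconciler = "" // status pointed at a pod that no longer exists
		return true
	}},
	{"delete-stale", func(r *Realm, c Cluster, _ time.Time) bool {
		for name, p := range c {
			if p.Generation != r.Generation { // pod built for an older spec
				delete(c, name)
				return true
			}
		}
		return false
	}},
	{"garbage-collect", func(r *Realm, c Cluster, _ time.Time) bool {
		for name := range c {
			if name != r.Reconciler { // no longer referenced by the status
				delete(c, name)
				return true
			}
		}
		return false
	}},
	{"skip-interval", func(r *Realm, _ Cluster, now time.Time) bool {
		return r.Reconciler == "" && !r.LastSuccess.IsZero() &&
			now.Before(r.LastSuccess.Add(r.Interval))
	}},
	{"handle-status", func(r *Realm, c Cluster, now time.Time) bool {
		p, ok := c[r.Reconciler]
		if !ok {
			return false
		}
		if p.Phase == "Succeeded" {
			r.LastSuccess = now
		}
		if p.Phase != "Running" {
			r.Reconciler = "" // finished: unlink so the pod gets collected
		}
		return true // still running or just handled; exit either way
	}},
	{"create", func(r *Realm, c Cluster, _ time.Time) bool {
		r.Reconciler = fmt.Sprintf("reconciler-%d", r.Generation)
		c[r.Reconciler] = &Pod{Generation: r.Generation, Phase: "Running"}
		return true
	}},
}

// reconcile runs the phases in order and reports the first one that acted.
func reconcile(r *Realm, c Cluster, now time.Time) string {
	for _, p := range phases {
		if p.run(r, c, now) {
			return p.name
		}
	}
	return "idle"
}

// demo replays a sequence of reconcile passes and records which phase acted.
func demo() []string {
	now := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	r, c := &Realm{Generation: 1, Interval: time.Hour}, Cluster{}
	var trace []string
	pass := func() { trace = append(trace, reconcile(r, c, now)) }

	pass() // no pod yet: create one
	pass() // pod still running: wait
	c[r.Reconciler].Phase = "Succeeded"
	pass() // record success and unlink
	pass() // collect the finished pod
	pass() // interval not elapsed: skip
	now = now.Add(2 * time.Hour)
	pass()           // interval elapsed: create again
	r.Generation = 2 // spec changes while the pod is running
	pass()           // pod is stale: delete it
	pass()           // status reference is dangling: unlink
	pass()           // create a pod for the new generation
	return trace
}

func main() { fmt.Println(demo()) }
```

Because each pass handles at most one phase and then exits, a pass that starts from a stale cached resource can do only one small, repeatable step before the next pass re-reads the state.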