buehler / dotnet-operator-sdk

KubeOps is a kubernetes operator sdk in dotnet. Strongly inspired by kubebuilder.
https://buehler.github.io/dotnet-operator-sdk/
Apache License 2.0

[feature]: Support reconciliation on status updates #668

Closed by PSanetra 11 months ago

PSanetra commented 11 months ago

Is your feature request related to a problem? Please describe.

At the moment it is not possible to reconcile on status updates as the ResourceWatcher implementation skips events where the generation is not incremented.

Related code: https://github.com/buehler/dotnet-operator-sdk/blob/e3e88512e6e98b4a6b206300cee87a407aa6ec4b/src/KubeOps.Operator/Watcher/ResourceWatcher%7BTEntity%7D.cs#L212-L219
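
To illustrate the behavior described above, here is a minimal, self-contained sketch of a generation-based event filter. It is not the actual `ResourceWatcher` code; the `GenerationFilter` class and its members are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the watcher's bookkeeping; NOT the KubeOps implementation.
public sealed class GenerationFilter
{
    // Last generation that was passed on to reconciliation, keyed by resource UID.
    private readonly Dictionary<string, long> _seen = new();

    // Returns true when the event should trigger reconciliation.
    public bool ShouldReconcile(string uid, long generation)
    {
        if (_seen.TryGetValue(uid, out var last) && generation <= last)
        {
            // The generation did not increase, which is typically the case
            // for a status or metadata update, so the event is skipped.
            return false;
        }

        _seen[uid] = generation;
        return true;
    }
}
```

With a filter like this, the first event for a resource and every event with a higher generation pass through, while a status-only update (same generation) is dropped. That is why a status change never reaches the controller.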

Describe the solution you would like

I suggest adding a configuration option that enables reconciliation on any event.

Additional Context

No response

buehler commented 11 months ago

Hey @PSanetra

The thing is that a status update "should not" trigger reconciliation, as far as I understand it. That's why the generation is not updated. Reconciling on status updates may lead to errors and to multiple reconciliations running simultaneously.

Do you think it is required to have reconciliation on status updates?

PSanetra commented 11 months ago

@buehler I think we have a use case where it makes sense to react to status updates.

Is there a specification for how an operator should behave? (Asking because your wording looks like a citation of a specification.)

We can change our CRD to move the status fields into the spec, but it feels more like a hack.

buehler commented 11 months ago

I don't know if it is an official spec, but it is my general understanding of the matter. Also, I found definitions by Red Hat (best practices, "managing status" part) that state:

Under normal circumstances, If we were updating our resource every time we execute the reconcile cycle, this would trigger an update event which in turn would trigger a reconcile cycle in an endless loop.

For this reason, Status should be modeled as a subresource as explained here.

This way we can update the status of our resources without increasing the ResourceGeneration metadata field.

As I understand this, status updates should not trigger a reconcile loop. To achieve the behaviour that you seek, you can periodically check the resources and requeue them with an injected EntityRequeue<TEntity> delegate.

There is an example documented at https://buehler.github.io/dotnet-operator-sdk/docs/api/KubeOps.Abstractions.Queue.EntityRequeue-1.html.
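
For the thread, a rough sketch of the idea, independent of the linked example. To keep it self-contained it does not reference the SDK: the `Requeue<TEntity>` delegate is a hypothetical stand-in that is assumed to have the same shape as `EntityRequeue<TEntity>` (entity plus delay), and `DemoEntity` and `DemoController` are made-up names. The "requeue while not Ready" rule is just one possible policy.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the SDK's requeue delegate (assumed shape: entity + delay).
public delegate void Requeue<TEntity>(TEntity entity, TimeSpan requeueIn);

// Hypothetical minimal entity; a real one would be a Kubernetes custom resource.
public sealed record DemoEntity(string Name, string Phase);

public sealed class DemoController
{
    private readonly Requeue<DemoEntity> _requeue;

    // The delegate is injected, e.g. by a DI container.
    public DemoController(Requeue<DemoEntity> requeue) => _requeue = requeue;

    public Task ReconcileAsync(DemoEntity entity)
    {
        // Inspect spec and status here. Since status updates do not trigger
        // a new event, ask to be called again later while work is pending.
        if (entity.Phase != "Ready")
        {
            _requeue(entity, TimeSpan.FromSeconds(30));
        }

        return Task.CompletedTask;
    }
}
```

The controller stays the single place that reacts to the resource, and the status is re-checked on a timer instead of on every status write.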

With the requeue mechanism, you can periodically check an entity and its status. Also, if multiple components manipulate your resources through the Kubernetes API, you may run into a split-brain problem.

wdyt?

PSanetra commented 11 months ago

ok, with that explanation it should be ok, I guess. Is it possible to add documentation somewhere regarding this behavior? (I am not sure where.)

buehler commented 11 months ago

The idea of it is documented here: https://buehler.github.io/dotnet-operator-sdk/src/KubeOps.Operator/README.html#controller

But I agree, this could be made clearer.

I'll update it in the documentation.