Open siegfriedweber opened 2 years ago
Actually this is how the restart controller is supposed to work. The Superset operator creates a StatefulSet labeled with `restarter.stackable.tech/enabled=true`. The restart controller of the commons operator watches this labeled StatefulSet and adds the UIDs and resource versions of the referenced ConfigMaps and Secrets as annotations to the pod template. This triggers a restart of the StatefulSet right after its first creation. The commons operator watches these ConfigMaps and Secrets and updates the annotations whenever these resources change, which in turn triggers a restart. This ensures that Superset always runs with the latest configuration.
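The annotation scheme described here can be sketched roughly as follows. This is a hedged illustration in Python, not the operator's actual code (the real controller is Rust); the function name and input shape are hypothetical, and only the annotation key/value format is taken from the YAML quoted in the original ticket below.

```python
# Sketch (hypothetical names): build the restarter annotations that the
# commons operator adds to the pod template. The "<uid>/<resourceVersion>"
# value format matches the annotations shown in the original ticket.
def restarter_annotations(configmaps, secrets):
    """configmaps/secrets: lists of dicts with name, uid, resourceVersion."""
    annotations = {}
    for cm in configmaps:
        annotations[f"configmap.restarter.stackable.tech/{cm['name']}"] = (
            f"{cm['uid']}/{cm['resourceVersion']}"
        )
    for sec in secrets:
        annotations[f"secret.restarter.stackable.tech/{sec['name']}"] = (
            f"{sec['uid']}/{sec['resourceVersion']}"
        )
    return annotations

# Any change to a referenced resource bumps its resourceVersion, so the
# annotation value changes, the pod template changes, and the
# StatefulSet rolls.
```
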
Even if the immediate restart is confusing, I do not see an elegant solution to prevent this, so I propose to close this issue.
If the outcome is "this is how it is supposed to work", then it might not be a bug, but we should still go back to the drawing board and look at what we've built, because this immediate restart is really something that shouldn't happen.
Maybe the title of the issue and also its place in this repo is not correct anymore.
After talking to sigi, I'm moving it back to "Refinement: In Progress"
Taking a step back. Outcomes from the refinement discussion:
@soenkeliebau @lfrancke unclear how to proceed here, some input would be great. Do we leave things as they are here, or do we pursue one of the investigation threads?
This ticket can be closed in favour of these other investigation threads.
That the commons-operator is reconciling doesn't really say much, that'll happen whenever the STS is modified externally too. If you want to check whether it has triggered a restart then you'll want to check the STS YAMLs for whether the restarter annotations change.
I'd also argue that the restart controller isn't the cause of the problem here, just a symptom. The restart controller will trigger when the app's config changes. Check for why that happens if you want to avoid this.
(Case in point, if the restart controller is at fault then this will happen for all product operators, not just Superset...)
Thanks Teo.
The restart controller of the commons operator watches this labeled stateful set and adds the UIDs and resource versions of the referenced config maps and secrets as annotations to the pod template. This triggers a restart of the stateful set right after its first creation.
I'm a bit confused now, though. This sounds like the answer that you were looking for, no? I must be missing something still. I thought the restart happens because the restart controller changes the configmaps which in turn triggers a restart of the STS?
The restart controller just watches the configmaps for changes, and triggers STS restarts based on that. The restart controller is just a messenger for the actual issue (that the CM changed). It never modifies the CMs itself.
Understood, thank you. So, the described behavior has to come from somewhere else then.
@siegfriedweber @fhennig @vsupalov - sorry, back to you for now?
If you want to check whether it has triggered a restart then you'll want to check the STS YAMLs for whether the restarter annotations change.
The restarter annotations are initially added to the STS YAMLs by the restart controller. This triggers the restart.
The restart controller will trigger when the app's config changes.
The restart controller also triggers when a new StatefulSet with the annotation `restarter.stackable.tech/enabled=true` is created.
(Case in point, if the restart controller is at fault then this will happen for all product operators, not just Superset...)
It happens for all StatefulSets which are annotated with `restarter.stackable.tech/enabled=true`. That is, it also happens for the Airflow operator. The other operators do not use this annotation yet.
I think that I accurately described the issue in my first comment:
Actually this is how the restart controller is supposed to work. The Superset operator creates a StatefulSet labeled with `restarter.stackable.tech/enabled=true`. The restart controller of the commons operator watches this labeled StatefulSet and adds the UIDs and resource versions of the referenced ConfigMaps and Secrets as annotations to the pod template. This triggers a restart of the StatefulSet right after its first creation. The commons operator watches these ConfigMaps and Secrets and updates the annotations whenever these resources change, which in turn triggers a restart. This ensures that Superset always runs with the latest configuration.
Please tell me if I am wrong.
@siegfriedweber and @teozkr will have a chat about this
Hm, the label "should" still be added quickly enough that the pods shouldn't have been scheduled yet, but maybe my memory fails me on how much of a window we have; I will have to do some testing tomorrow... If that is indeed the sad case, then a fail-open mutating webhook probably makes sense at some point (since we'd still need the controller as well, the worst that could happen would be falling back to the current extra restart), and it should be completely transparent to the operators.
The downside is that doing a first webhook means doing a bunch of new scaffolding from scratch (certificates, kube API registration, etc)...
Ok, having tried it out now, it does look like the first replica is restarted, while all other replicas are created late enough that they are up to date. Avoiding that glitch will probably require us to do the webhook...
I have validated that a mutating webhook is able to avoid the initial restart for a simple STS... https://github.com/stackabletech/commons-operator/commit/0cbf72c90741fd618a9e97266f8fdd66c20ee544
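For reference, the core of such a mutating webhook is an AdmissionReview response carrying a base64-encoded JSONPatch. A minimal Python sketch of that response (the actual spike in the linked commit is Rust; the function name and input shape here are assumptions):

```python
import base64
import json

def admission_response(request_uid, annotations):
    """Sketch: allow the StatefulSet and patch the restarter annotations
    onto its pod template. Assumes the template already has an
    annotations map; a real webhook would handle the missing-map case."""
    def escape(key):
        # JSON Pointer escaping per RFC 6901: "~" -> "~0", "/" -> "~1"
        return key.replace("~", "~0").replace("/", "~1")

    patch = [
        {
            "op": "add",
            "path": f"/spec/template/metadata/annotations/{escape(k)}",
            "value": v,
        }
        for k, v in annotations.items()
    ]
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request_uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }
```

Since webhook delivery is best-effort when run fail-open, the existing controller would stay in place as the safety net; the webhook only removes the extra restart in the common case.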
Moving this ticket to commons since this is a bug in the restarter, not in superset-op specifically.
So, is the mutating webhook the accepted solution to this problem? I see that this ticket is in refinement acceptance; I'd like to know the up- and downsides of this solution. Is there another way? How bad is:
The downside is that doing a first webhook means doing a bunch of new scaffolding from scratch (certificates, kube API registration, etc)...
I personally wouldn't call this ticket refined, as I wouldn't know what to do. The ticket doesn't have tasks or even acceptance criteria.
I agree, this came up during the acceptance. So I'll move it back for now.
A few questions:
Do I understand it correctly that this will "only" fix the issue in most cases since delivery of webhooks is not reliable?
Yes, good old CAP strikes again. There's no way that we could possibly hook into an API in a way that is both:
We could technically sacrifice any of these, but (to me) 2 and 3 are both far more vital than 1.
Do webhooks require any kind of extra privileges that we currently don't need?
The operator and installer both need permission to work with `MutatingWebhookConfiguration` objects. Additionally, webhooks do introduce the risk of bugs and/or misconfiguration wreaking havoc on the larger system.
This design would allow users who are uncomfortable with running our webhook to simply not deploy the MWC, at the cost of still having to deal with this extra restart.
Do webhooks work on OpenShift?
I haven't tested the spike against OpenShift specifically, but Google seems to believe yes. I also remember noticing that certain parts of OpenShift itself are implemented using webhooks.
Does this need to be presented to the team again? I assume no because I believe this is the outcome of the last team session, right?
It might be worth bringing up at tomorrow's arch meeting. I'll add it to the list.
Yes, good old CAP strikes again. There's no way that we could possibly hook into an API in a way that is both:
That's fine for me, as the worst that can happen is that we get the extra restart, right? Similar to the answer to question two.
Thank you. If the team is fine with this then I am as well.
I think we didn't really discuss this broadly, I'll put it on the agenda for tomorrow, so we can have a brief look
That's fine for me, as the worst that can happen is that we get the extra restart, right?
Yes.
After some discussion with @sbernauer and @siegfriedweber we decided to push on with the webhook approach.
So this means refinement is done?
I'd say so, yes. There's still the question of how we want to prioritize this, but I guess that's your department... :P
Teo did a handover with Siggi
Current behaviour
New restarter-enabled StatefulSets have their replica 0 restarted after the initial rollout is complete.
Expected Behaviour
The initial rollout should be completed "normally" with no extra restarts.
Why does this happen?
There is a race condition between Kubernetes' StatefulSet controller creating the first replica Pod and commons' StatefulSet restart controller adding the restart trigger annotations. If the restarter loses the race then the first replica is created without the metadata, triggering a restart once it is added.
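To make the race concrete: a StatefulSet rolls whenever its pod template changes, so the extra restart is just the template diff between "before" and "after" the restarter annotations land. A toy illustration (a simplified assumption, not operator code):

```python
# Toy model: the StatefulSet controller restarts pods when the pod
# template changes. If the first pod was created from the unannotated
# template, the later annotation update is a template change and forces
# one extra rollout.
def needs_rollout(template_before, template_after):
    return template_before != template_after

unannotated = {"annotations": {}}
annotated = {"annotations": {"configmap.restarter.stackable.tech/cfg": "uid/1"}}

# Restarter loses the race: the pod exists, then the annotations land.
print(needs_rollout(unannotated, annotated))  # True: extra restart
# Webhook injects the annotations at admission time: no later diff.
print(needs_rollout(annotated, annotated))    # False: no restart
```
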
What can we do about it?
Add a mutating webhook (see the spike) that adds the relevant metadata. The webhook must not replace the existing controller, since webhook delivery is not reliable.
However, webhook delivery requires a bunch of extra infrastructure that we do not currently have, namely:
Definition of done

- Add the `podTemplate` annotations as the controller currently does
- `STS.metadata.generation` should stay `1` (…)
- `failurePolicy: Ignore` (…)
- `failurePolicy` (…)

Original ticket
The StatefulSet of a Superset cluster is immediately restarted after its creation. This should not be necessary and should be prevented.

```bash
$ kubectl describe statefulset simple-superset-node-default
...
Events:
  Type    Reason            Age                    From                    Message
  ----    ------            ----                   ----                    -------
  Normal  SuccessfulDelete  3m31s                  statefulset-controller  delete Pod simple-superset-node-default-0 in StatefulSet simple-superset-node-default successful
  Normal  SuccessfulCreate  2m59s (x2 over 3m31s)  statefulset-controller  create Pod simple-superset-node-default-0 in StatefulSet simple-superset-node-default successful
```

After the restart, the Superset pods are annotated as follows:

```yaml
annotations:
  configmap.restarter.stackable.tech/simple-superset-node-default: cf60300e-0c45-4ee2-b60c-de53b0084182/21998
  secret.restarter.stackable.tech/simple-superset-credentials: e0e4b781-46f9-44f4-80c8-a0876b91ed8b/16909
```

This could be an indication that the restart controller of the commons-operator is involved. The commons-operator is busy while the StatefulSet is restarted:

```
2022-09-16T10:02:50.228971Z INFO stackable_operator::logging::controller: Reconciled object controller.name="pod.restarter.commons.stackable.tech" object=Pod.v1./simple-superset-node-default-0.default
2022-09-16T10:02:50.236144Z INFO stackable_operator::logging::controller: Reconciled object controller.name="pod.restarter.commons.stackable.tech" object=Pod.v1./simple-superset-node-default-0.default
2022-09-16T10:02:50.239371Z INFO stackable_operator::logging::controller: Reconciled object controller.name="statefulset.restarter.commons.stackable.tech" object=StatefulSet.v1.apps/simple-superset-node-default.default
2022-09-16T10:02:50.255219Z INFO stackable_operator::logging::controller: Reconciled object controller.name="statefulset.restarter.commons.stackable.tech" object=StatefulSet.v1.apps/simple-superset-node-default.default
2022-09-16T10:02:50.258511Z INFO stackable_operator::logging::controller: Reconciled object controller.name="pod.restarter.commons.stackable.tech" object=Pod.v1./simple-superset-node-default-0.default
2022-09-16T10:02:50.266146Z INFO stackable_operator::logging::controller: Reconciled object controller.name="pod.restarter.commons.stackable.tech" object=Pod.v1./simple-superset-node-default-0.default
2022-09-16T10:02:50.274647Z INFO stackable_operator::logging::controller: Reconciled object controller.name="statefulset.restarter.commons.stackable.tech" object=StatefulSet.v1.apps/simple-superset-node-default.default
2022-09-16T10:02:51.433621Z INFO stackable_operator::logging::controller: Reconciled object controller.name="pod.restarter.commons.stackable.tech" object=Pod.v1./simple-superset-node-default-0.default
2022-09-16T10:02:51.449009Z INFO stackable_operator::logging::controller: Reconciled object controller.name="statefulset.restarter.commons.stackable.tech" object=StatefulSet.v1.apps/simple-superset-node-default.default
```