Closed dannyzaken closed 5 years ago
- While testing noobaa provisioner, we noticed that when the provisioner process crashes or when it's down during Delete, the OBC is deleted, but the OB is not. After restarting the provisioner the Delete function is not called again
It's a known issue/stretch feature, see #52. There's been a lot of discussion on the solution between @jeffvance and myself and Rook/Crossplane folks. The same problem exists between core k8s and external file/block provisioners. The solution implemented there is to have the core k8s volume controller scan PVs, select those that were dynamically provisioned, and check if it has an associated PVC. If not, it changes the phase of the PV to released
. The update event is detected by the provisioner controller and the PV is reconciled. The provisioner should recognize the released
phase change and act accordingly. It's a fairly simple solution.
The benefit to file/block is that core k8s supports them, so a separate controller runtime is baked into every k8s cluster. The library would be requiring provisioners to run a second pod to perform these scans and updates. We didn't think that was a great approach.
The alternative is to implement this controller in the library, secondary to the OBC controller. It would scan OBs, select the ones that match it's provisioner, check if there's an OBC, and if not, call Delete/Revoke dependent on it's provisioning method. It's important that the controller make Get calls against the Indexer or Store instead of the API server for optimization.
That's where @jeffvance and I left off anyhow. We're completely open to discussing other solutions and implementations.
-shouldn't a finalizer be added to the OBC to protect it?
I think the answer is no. Doing so means we lock users out of the objects they created themselves. Finalizers are better suited to protecting system generated objects that the system is expected to clean up.
Not so sure what is the problem with setting a finalizer. PVC's also have it:
finalizers:
- kubernetes.io/pvc-protection
I would simply set a finalizer that will allow the provisioner to cleanup all the resources (bucket, account, etc) that it created, and if there are no errors then it will be instantaneous, if there are failures, it will keep the OBC in terminating state to make sure that cleanup is not forgotten.
shouldn't a finalizer be added to the OBC to protect it?
Philosophically, we didn't want the lib to add a finalizer to a user-created resource but @guymguym makes a good point above.
@jeffvance Lets put philosophy aside :) 👨🎨
We need the provisioner to reconcile OBC deletions consistently. It can do that in an eventual-consistent way, and have delays in its cleanups, but it doesn't seem valid to just forget about whole buckets with potentially a lot of storage allocated to them, and leave that data remain after the user requested to delete it.
A scanning solution will work, but will not scale, and a finalizer is essentially an "index" to quickly find which OBC's need to be scanned, instead of just running on all OBCs every time we scan.
So I think that a finalizer is a well known paradigm that will solve the issue easily. If it is discouraged for user-created resource - do you have references to that, and what is the suggested alternative?
Thanks!
I am liking the suggestion to add a finalizer to the OBC following the k8s PVC pattern (code)
Per our Aug 8th meeting - the consensus is that this is a reasonable addition for ensuring that the user facing API and back-end states keep in sync.
Once we add a finalizer to OBC we can remove the finalizer from the Secret and ConfigMap to let them get garbage collected automatically using their owner reference to the OBC.
@guymguym We should decide what triggers the cascading delete. In our initial design the deletion of the OBC is the triggering event. In the situation above it appears that the deleting of the CM or Secret would trigger the cascading delete. Is that what you want?
I still think it would be unexpected for the user to delete a secret they didn't create and that would cause the CM, OB, OBC and provisioned bucket to all be deleted.
OBC is the trigger. By setting the secret and cm owner reference to point to obc we get exactly that.
What would happen if the user deletes the secret (only)?
Kubernetes will simply allow to delete it, and the reference to OBC will be removed by it. If we choose we can reconcile and recreate it by asking the provisioner to get the secret keys, which I think makes sense.
Our goal was to reduce the chances of inadvertently deleting the secret and CM (which would be an inconsistent state). What is the value of not have a finalizer on the secret and CM? It is interesting to consider the case where a new secret is needed and is replaced upon deletion.
Can you find references for using finalizers to protect from accidental deletion? Afaik finalizers are a form of destructors to handle the cleanup of external resources.
They definitely server that purpose ( handle the cleanup of external resources). I still think that, unless we want to handle cases of secrets and CMs being deleted while the OB and OBC remain, we should make an effort to help keep the coordination of these resources consistent. IOW, it is reasonable to expect that if I have an OBC then I also have the associated Secret, CM and OB. Something other than that would be an anomaly.
I am definitely in favor of adding a finalizer to the OBC. I am unsure if we want to remove the finalizers from the CM and Secret. @guymguym do we have the ownerReference relationship right? There was an offline suggestion that the OBC's ownerReferences should point to the secret and CM rather than the Secret and CM's ownerRef pointing to the obc.
The direction is correct - Secret, CM , OB are the dependant objects and should reference to their owner, the OBC.
I use this when creating all the resources that are dependant on the NooBaa CR with owner reference, and later when the NooBaa CR is deleted, kubernetes garbage collects every dependant resource (Pods, StatefulSets, Secrets, Services). This is the easiest way to cleanup, and this is why I think we should avoid finalizers for those dependant resources.
The confusion might arise from being used to programming languages with garbage collection in which the referring object is actually the owner, while in the kubernetes model the referring object is the owned one.
The first paragraph here will clarify it:
Some Kubernetes objects are owners of other objects. For example, a ReplicaSet is the owner of a set of Pods. The owned objects are called dependents of the owner object. Every dependent object has a metadata.ownerReferences field that points to the owning object.
In order to keep OBC, Secret, CM, OB in sync, we should simply reconcile the OBC to recreate them if they are modified/deleted. This is how other controllers do. For example - when you delete a Pod which is part of a deployment, it gets deleted, and the deployment controller will recreate it. This is also very useful - to remove faulty states of the pod. I don't see any difference here. If a user decides to delete one of the dependant objects we should simply recreate them. We can choose how aggressive we are in overriding user changes to them.
IMHO for most cases only deletion should be handled, and modifications are usually manual intervention which should not be overridden. This allows good balance of user vs. controller control so the user can override values for testing or troubleshooting, and when needing to revert it one can simply delete the modified objects and the controller will recreate.
I'm starting to agree with you! :smile:
I agree we should remove the finalizers from child resources and just reconciling accidental deletion. There are mechanisms within k8s that allow for acting on events of child resources, but queuing the parent for reconciling. They're abstracted nicely in the operator framework - which we don't use - but they should still be doable. This would save having to implement 2 more controllers for configmaps and secrets.
I wish I could find the docs that describe the odd behavior that crops up when a finalizer is set on an object that also has an owner reference but I remember reading it at some point. That alone is reason enough remove them.
fixed by #139
Delete
, the OBC is deleted, but the OB is not. After restarting the provisioner theDelete
function is not called again