Closed sebastianrosch closed 4 years ago
Hi @sebastianroesch. Thanks for your PR.
I'm waiting for a pusher member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test
on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test
label.
I understand the commands that are listed here.
Hey @sebastianroesch, I've just had a quick skim of this and I think there's probably some misunderstanding on finalizers here.
You shouldn't be setting a finalizer on the child resources (Deployments, services etc). Finalizers are intended for controllers to handle special clean up logic, not to just protect something from being deleted.
With the current implementation, when we delete the GitTrack
, we will leave all child resources with a finalizer with no controller to remove them. The GC will see the GTO
has been deleted and then mark all children for deletion, but sit in an infinite loop because of the finalizer. This is not a desirable state to be in as resource updates are blocked when they are marked for deletion except for finalizers.
What we should be doing for this is instead:
GTO
when they are created or updatedGTO
controller Reconcile if the GTO .metadata.deletionTimestamp
is not nil (this means it has been deleted by the GC or otherwise)GTO
and the GC will remove it from ETCD, and in theory, because there's no owner ref on the child, should not delete the childPlease see my descriptions of this design in #101, I went into more detail there
Thanks for the feedback, that makes sense. I think I was too focused on protecting the child resources from being deleted and didn't trust the owner references anymore after our previous issues.
One thing I'm not sure of is how we would be able to distinguish the deletion of a resource from the GitOps repo and the loss of the GitTrack
. In both cases, the deletionTimestamp
would be set on the GitTrackObject
, so on Reconcile the GitTrackObject
wouldn't know whether to keep the child resource (loss of GitTrack
case) or remove it (deletion of resource in GitOps repo).
One thing I'm not sure of is how we would be able to distinguish the deletion of a resource from the GitOps repo and the loss of the GitTrack. In both cases, the deletionTimestamp would be set on the GitTrackObject, so on Reconcile the GitTrackObject wouldn't know whether to keep the child resource (loss of GitTrack case) or remove it (deletion of resource in GitOps repo).
For the first pass, I wouldn't distinguish. We can get a PR ready with this feature that just keeps all resources around on any deletion event. Once we are happy with the "protection" mechanism, we could then create a PR into the feature branch with the should we keep or should we delete logic as described in https://github.com/pusher/faros/issues/101#issuecomment-474290027, does that seem like a reasonable way to tackle this?
@sebastianroesch: PR needs rebase.
@JoelSpeed I would like your feedback on this:
To prevent accidental deletion of resources due to etcd issues and k8s garbage collection, as discussed in #101, I added finalizers to the child resources. This is not the full implementation as discussed, and this behavior is currently not configurable. I would like your feedback on whether this is a good way forward.
We have tested this in different scenarios, most specifically in the etcd leader loss that lead to an outage as described in #143.
Since we still have to delete resources when removed from the GitOps repo, Faros also needs to ability to remove finalizers. But as the
GitTrack
controller, which knows about the leftover resources, does not know the child resources, it cannot remove the finalizers. Therefore, I moved deleting theGitTrackObject
into theGitTrackObject
controller so that it can remove the finalizer before being deleted. TheGitTrackObject
is only marked for deletion via an annotation.I hope this makes sense. Thanks!