Open TCMPK opened 1 year ago
This bug is still present, I've checked it with release v0.5.0 and with current commit 5e0d99269289a6c47b232d829eff424cfa374087.
Running deletion tasks on Delete() seems to be a key functionality according to the documentation. So either documentation or functionality needs to be fixed.
I have limited experience with Kubernetes controllers and Crossplane framework, please check if this makes sense.
This works on my dev setup. Delete()
is marking "Deletion" Reason, and we still haven't got an evaluation about if the playbook returns failed tasks or not, so we just rely on the one-shot Run()
execution. So we can take meta.WasDeleted()
and Deleting
Reason as marks (inside Observe()
) to know that the controller has done once the Delete()
run and then we mark a non-existent resource.
Another option is to move the mark "Deleting" inside Delete()
after the Run()
function and the corresponding error evaluation to get stuck to Delete() until c.runner.Run()
has been done without errors.
Please, check the proposal:
func (c *external) Observe(ctx context.Context, mg resource.Managed) (managed.ExternalObservation, error) {
cr, ok := mg.(*v1alpha1.AnsibleRun)
if !ok {
return managed.ExternalObservation{}, errors.New(errNotAnsibleRun)
}
/* set Deletion Policy to Orphan as we cannot observe the external resource.
So we won't wait for external resource deletion before attempting
to delete the managed resource */
/********** Don't go here in DeletionOrphan Policy **********/
// cr.SetDeletionPolicy(xpv1.DeletionOrphan)
/***************************************************************/
switch c.runner.GetAnsibleRunPolicy().Name {
case "ObserveAndDelete", "":
if c.runner.GetAnsibleRunPolicy().Name == "" {
ansible.SetPolicyRun(cr, "ObserveAndDelete")
}
if meta.WasDeleted(cr) {
/********** If it is marked for Deletion check if "Deleting" Reason is present **********/
/********** If "Deleting" reason is present it means Delete() has been run, **********/
/********** so we return non-existence of Resource **********/
cond_del := cr.Status.ResourceStatus.GetCondition(xpv1.Deleting().Type)
if cond_del.Reason == "Deleting" {
return managed.ExternalObservation{ResourceExists: false}, nil
}
return managed.ExternalObservation{ResourceExists: true}, nil
}
...
The solution doesn't behave well when the Ansible Run()
at Delete()
takes a long time. Some delete requests are enqueued with original conditions from the managed resource. Thus, it is not able to read "Deleting" reason at those enqueued delete requests.
So maybe a good solution is to manage the Delete()
call inside Observe()
itself and previously check if the resource is still on Kubernetes. So we check if it is marked for Delete, then we search it in Kubernetes, if found, delete it in Observe.
func (c *external) Observe(ctx context.Context, mg resource.Managed) (managed.ExternalObservation, error) {
cr, ok := mg.(*v1alpha1.AnsibleRun)
if !ok {
return managed.ExternalObservation{}, errors.New(errNotAnsibleRun)
}
cr.SetDeletionPolicy(xpv1.DeletionDelete)
....
switch c.runner.GetAnsibleRunPolicy().Name {
case "ObserveAndDelete", "":
if c.runner.GetAnsibleRunPolicy().Name == "" {
ansible.SetPolicyRun(cr, "ObserveAndDelete")
}
if meta.WasDeleted(cr) {
/* It is marked as delete and observe first if the resource is on Kubernetes */
observed := cr.DeepCopy()
if err := c.kube.Get(ctx, types.NamespacedName{
Namespace: observed.GetNamespace(),
Name: observed.GetName(),
}, observed); err != nil {
if kerrors.IsNotFound(err) {
return managed.ExternalObservation{ResourceExists: false}, nil
}
return managed.ExternalObservation{}, fmt.Errorf("%s: %w", errGetAnsibleRun, err)
}
/* If found, Delete it */
err := c.Delete(ctx, mg)
if err != nil {
return managed.ExternalObservation{}, err
}
return managed.ExternalObservation{}, nil
}
...
The
provider-ansible
provider does not invoke a finalizer or similar to cleanup created resources when a delete operation is triggered in the control plane (e.g.kubectl delete ansiblerun <run>
) This behavior is not reflected in the documentation.What problem are you facing?
When deleting a resource - here
AnsibleRun
the provider does not invoke an ansible-run with the „state=absent“ variable. The documentation and examples indicate that there should be an ansible-run to reflect the changes to the provisioned resources. (article and referenecd module) Because of this behavior only creation and update tasks are performed. This may be related to the following code fragment whichOrphans
(Link to source) the resource.How could Crossplane help solve your problem?
Implement the functionality or adjust the documentation that this is a planned or
wont-fix
behavior of the provider.