Summary

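The Kubernetes TTLAfterFinished feature cleans up finished jobs. Because the cleaned-up job no longer existed, the AnsibleJob reconciliation loop created a new job to replace it, re-running the automation. This change adds an isFinished status so that a finished job is not retried after it is deleted.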
Resolves https://github.com/ansible/awx-resource-operator/issues/49

Details

When the job fails, the "rescue" block re-runs it in a new pod. If that attempt also fails, the isFinished status is now set to true, which short-circuits the reconciliation loop on the next run so that no further attempts are made. Example output from the AnsibleJob resource's status:

        "conditions": [
            {
                "lastProbeTime": "2022-02-18T00:13:30Z",
                "lastTransitionTime": "2022-02-18T00:13:30Z",
                "message": "Job has reached the specified backoff limit",
                "reason": "BackoffLimitExceeded",
                "status": "True",
                "type": "Failed"
            }
        ],
        "failed": 2,
        "startTime": "2022-02-18T00:12:47Z"
    }

For success scenarios, the isFinished status on the AnsibleJob resource is set to true along with the other statuses that denote a successful job run.
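
The short-circuit described above can be sketched as a small decision function. This is only an illustration in Python, not the operator's actual code (the "rescue" block suggests that logic lives in Ansible tasks); the function name `next_action`, the `job_exists` flag, and the returned action strings are hypothetical, and the sketch assumes `isFinished` is read from the AnsibleJob status as a boolean.

```python
def next_action(status: dict, job_exists: bool) -> str:
    """Decide what one reconciliation pass should do (illustrative only).

    `status` is the AnsibleJob resource's status as a plain dict and
    `job_exists` says whether the backing Kubernetes job is still present.
    Both parameters and the returned strings are hypothetical names.
    """
    # A finished AnsibleJob is never relaunched, even if TTL cleanup
    # has already deleted its Kubernetes job.
    if status.get("isFinished", False):
        return "skip"
    # Not finished and no job present: launch one.
    if not job_exists:
        return "create_job"
    # A job exists and has not finished yet: keep waiting for it.
    return "wait"


# Without isFinished, a missing job alone triggers a relaunch.
print(next_action({}, job_exists=False))
# With isFinished set, the missing job is ignored.
print(next_action({"isFinished": True}, job_exists=False))
```

Checking `isFinished` before checking whether the job exists is the point of the sketch: the order ensures that a job removed by TTL cleanup is not mistaken for one that was never launched.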

Extra Information

A new parameter called job_ttl can now be configured on the AnsibleJob spec to set the time-to-live for a job that is in the finished state.

For example:

---
apiVersion: tower.ansible.com/v1alpha1
kind: AnsibleJob
metadata:
  generateName: demo-job-1 # generate a unique suffix per 'kubectl create'
spec:
  tower_auth_secret: awxaccess
  job_template_name: Demo Job Template
  inventory: Demo Inventory # Inventory prompt on launch needs to be enabled
  runner_image: quay.io/chadams/awx-resource-runner
  job_ttl: 500

ansible / awx-resource-operator

Add isFinished status and do not retry finished jobs if the job is deleted #64
