sleiner closed 2 years ago
Awesome! Thank you! Looks like the issue now is the molecule cache / destroy step which only seems to fail if there isn't a cache?
https://github.com/techno-tim/k3s-ansible/pull/48#issuecomment-1237570983
@timothystewart6
Looks like the issue now is the molecule cache / destroy step which only seems to fail if there isn't a cache?
I don't see that - can you explain?
The current last job fails with an error message I have not seen before:
failed: [control1] (item=controller) => {"ansible_loop_var": "item", "changed": false, "cmd": ["k3s", "kubectl", "wait", "deployment", "--namespace=metallb-system", "controller", "--for", "condition=Available=True", "--timeout=60s"], "delta": "0:00:01.804541", "end": "2022-09-06 10:07:00.859658", "item": {"condition": "--for condition=Available=True", "description": "controller", "name": "controller", "resource": "deployment"}, "msg": "non-zero return code", "rc": 1, "start": "2022-09-06 10:06:59.055117", "stderr": "error: the server doesn't have a resource type \"deployment\"", "stderr_lines": ["error: the server doesn't have a resource type \"deployment\""], "stdout": "", "stdout_lines": []}
I just retried the job in my fork and it succeeded for the same commit, so it appears that we have another source of flakiness. Unfortunately, I cannot reproduce it locally. I'll retry the job a few more times to check how often this occurs.
After that rather mysterious failure, I have let the latest revision (1a4346f1b8cdada7575447d5f67038a87ebc2622) run 11 more times using these retrigger commits. These are the results:
So my proposal is to double the retry delay for the MetalLB CR application and otherwise go with the changes as they currently are. That is already a substantial improvement over the current state, and if flakiness becomes a significant issue again, we can tackle it then. What do you think, @timothystewart6?
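For illustration, a retry delay on an Ansible task is usually set with the `until`, `retries` and `delay` keywords. The following is only a minimal sketch of what a doubled delay could look like, not the actual task from this PR: the task name, the manifest path, the registered variable name and both numbers are assumptions.

```yaml
# Hypothetical sketch; not copied from the repository.
# The manifest path and the retries/delay values are assumed.
- name: Apply MetalLB custom resources
  ansible.builtin.command: >-
    k3s kubectl apply -f /tmp/k3s/metallb-crs.yaml
  register: metallb_cr_apply            # hypothetical variable name
  until: metallb_cr_apply.rc == 0       # repeat until kubectl exits with 0
  retries: 5                            # assumed number of attempts
  delay: 10                             # seconds between attempts, doubled from an assumed 5
```

With these assumed values, the task would keep retrying for a little under a minute before failing, which gives the MetalLB components more time to become ready.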
Thank you very much!
Proposed Changes
post role anyway, I deduplicated some logic in it (side bonus: this makes the Ansible logs much more compact) and fixed a typo :-)
Checklist
site.yml playbook
reset.yml playbook