To work around this:
1) Changed the operator install to install one operator at a time and wait for each one before moving on to the next
2) Created a fixup-olm.sh script that:
a) Restarts the OLM catalog pod if the main OLM pod has restarted
b) Deletes and recreates any subscriptions that get into a bad state
c) Deletes any orphaned CSVs (those with no referencing subscription)
3) Called the fixup-olm.sh script while installing operators
4) Called the fixup-olm.sh script while installing Nav
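As an illustration of step 2c only, here is a minimal sketch of how orphaned CSVs could be detected and deleted. This is not the actual fixup-olm.sh; the function names orphaned_csvs and delete_orphaned_csvs are hypothetical, and it assumes a subscription "references" a CSV through its status.installedCSV or status.currentCSV field. It should only be pointed at the namespace that holds the subscriptions, since OLM can copy CSVs into other namespaces where they would look orphaned.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not the real fixup-olm.sh.

# Print every CSV name from $1 that does not appear in $2.
# $1: newline-separated names of all CSVs in the namespace
# $2: newline-separated CSV names referenced by subscriptions
orphaned_csvs() {
  all_csvs=$1
  referenced=$2
  printf '%s\n' "$all_csvs" | while IFS= read -r csv; do
    # Skip blank lines (e.g. when there are no CSVs at all).
    [ -n "$csv" ] || continue
    # -x matches the whole line and -F disables regex, so "foo.v1"
    # is not treated as referenced just because "foo.v1.0" is.
    printf '%s\n' "$referenced" | grep -qxF -- "$csv" || printf '%s\n' "$csv"
  done
}

# Delete the orphaned CSVs in namespace $1. Needs kubectl access to the cluster.
# Assumption: installedCSV/currentCSV are what tie a subscription to a CSV.
delete_orphaned_csvs() {
  ns=$1
  csvs=$(kubectl get csv -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}')
  refs=$(kubectl get subscriptions.operators.coreos.com -n "$ns" \
    -o jsonpath='{range .items[*]}{.status.installedCSV}{"\n"}{.status.currentCSV}{"\n"}{end}')
  orphaned_csvs "$csvs" "$refs" | while IFS= read -r name; do
    kubectl delete csv "$name" -n "$ns"
  done
}
```

The comparison is kept in its own function so it can be checked without a cluster. A call such as delete_orphaned_csvs my-operators (the namespace name is made up) would then run the cleanup for real.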
This test succeeded in 4 of 5 runs.
For the failed run, I noticed general problems on the cluster, with 1 of the 3 nodes not working. I think a cluster needs a minimum of 3 working nodes to function, so I created a 4th node and left the dodgy node in place. The cluster then looked healthier, so I re-ran 1-click manually and it worked: