When running boatswain on a cluster, I've found that SetNodeRoles can fail with the following panic:
[ ... snip regular operation ...]
Set node-role.kubernetes.io labels on new node
Identified role worker for node ip-10-200-7-89.us-west-2.compute.internal
node-role.kubernetes.io/worker
panic: Operation cannot be fulfilled on nodes "ip-10-200-7-89.us-west-2.compute.internal": the object has been modified; please apply your changes to the latest version and try again [recovered]
panic: Operation cannot be fulfilled on nodes "ip-10-200-7-89.us-west-2.compute.internal": the object has been modified; please apply your changes to the latest version and try again
[... snip panic ...]
Steps to Reproduce the Problem
I'm not sure how to reproduce the problem yet.
Actual Behavior
Boatswain exited completely, leaving the cluster in a half-maintained state. I had to manually drain and terminate the instance that was being replaced at the time of the failure.
Expected Behavior
Boatswain should retry or ignore this non-critical failure.