lorenz closed this issue 1 year ago
So this happens in the following scenario, right?
I.e. the bug is that hostsfile might squash a perfectly valid /etc/hosts / CD on startup without taking into account whatever was already there, and that can leave the node unable to ever connect to the cluster again?
Yes, that's my understanding.
The main update loop has a `changed` variable, which is set to true if either a local address change or a cluster change happened. The problem is that if there is no curator, or the curator has not been contacted yet, the `nodes` variable in the runnable does not contain any non-local nodes. Thus, if the local address is updated, the hostsfile service writes a cluster directory containing only the local node to disk, rendering the node unbootable without intervention.