Open trondhindenes opened 5 years ago
It's also probably worth mentioning that if I do a "regular" kops rolling update that replaces master nodes, we're not seeing the problem with left-behind master IP addresses. It only happens if we do an etcd-manager restore.
I've added some documentation in #251 on how to solve this issue.
Basically, if a master doesn't get deleted normally, its IP will stick around under /registry/masterleases in etcd. This is more of a Kubernetes thing than a kops or etcd-manager thing IMO. We could add a restore step that fixes this automatically, but it doesn't seem quite right to have a manager edit data. Might be nice to have it in kops though (kops restore)?
Interested to hear what others think.
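For anyone landing here, a rough sketch of the manual cleanup described above. It assumes the lease keys have the form /registry/masterleases/<ip>. The function name stale_master_leases and the example IPs are made up for illustration, and the etcdctl connection flags are left out because they depend on how your cluster's etcd is exposed. The function only filters key names; it does not touch etcd itself.

```shell
# Hypothetical helper: reads master-lease keys on stdin (one per line, as
# printed by `etcdctl get --prefix --keys-only`) and prints the keys whose
# trailing IP is NOT among the live master IPs given as arguments.
stale_master_leases() {
  while IFS= read -r key; do
    # --keys-only output separates keys with blank lines; skip them.
    if [ -z "$key" ]; then
      continue
    fi
    ip="${key##*/}"   # text after the last "/" is assumed to be the master IP
    stale=1
    for live in "$@"; do
      if [ "$ip" = "$live" ]; then
        stale=0
      fi
    done
    if [ "$stale" -eq 1 ]; then
      printf '%s\n' "$key"
    fi
  done
}

# Example usage (not run here; needs etcdctl v3 plus whatever --endpoints,
# --cacert, --cert and --key values your cluster requires):
#
#   ETCDCTL_API=3 etcdctl get --prefix --keys-only /registry/masterleases/ \
#     | stale_master_leases 10.0.1.10 10.0.2.10 10.0.3.10
#
# Review the printed keys, then remove each one with:
#
#   ETCDCTL_API=3 etcdctl del <key>
```

Keeping the delete as a separate, manual step is deliberate: the list of live master IPs is supplied by hand, so it's worth eyeballing the output before removing anything. Once the stale leases are gone, the old IPs should drop out of the kubernetes endpoint the next time the API servers reconcile it.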
We're testing out a procedure for a full master refresh using kops/etcd-manager (described here: https://hindenes.com/2019-08-09-Kops-Restore/). In short, we wipe the masters, let kops set up new masters, and use etcd-manager-ctl to restore the last known backup. This seems to work very well.
However, we're noticing that in-cluster apps that need access to the Kubernetes API sometimes fail. This seems to be caused by old (deleted) masters still being present in the kubernetes endpoint (kubectl -n default get endpoints kubernetes -o=yaml).
This is probably not an etcd-manager problem at all, but I'm at a loss regarding how to get rid of references to old (non-existing) masters, so any pointers would be deeply appreciated.