Open jawnsy opened 1 year ago
Hi @jawnsy, thank you for opening your first issue on the Consul on Kubernetes repo. Thank you also for the detail in the issue. I think this is a very sensible feature request.
Because the solution will likely be implemented at a Consul level and not a Consul on Kubernetes level, I am going to transfer the issue to the Consul repository so they can have a look at it.
Is your feature request related to a problem? Please describe.
When saving a snapshot (`consul snapshot save /tmp/abc.snap`) and restoring it into a different Kubernetes cluster (`consul snapshot restore /tmp/abc.snap`), Consul retains catalog entries for dead nodes, even though the active member list is updated appropriately. For example, I have some nodes in a 10.62.xx subnet:
The catalog shows these nodes:
However, the member list does not (because we moved the Consul installation from a Kubernetes cluster running in the 10.62 subnet to one in the 10.12 subnet):
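For anyone reproducing this, the discrepancy can be seen by comparing the live member list against the catalog (a sketch; it assumes a reachable Consul agent, and the actual node names and addresses will differ):

```shell
# Serf member list: reflects only the live nodes in the new cluster
consul members

# Catalog view: still lists the stale nodes from the old subnet
consul catalog nodes
```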
The solution to the above error messages is to manually deregister the dead nodes from the catalog, which also only appears possible using the REST API and not through the `consul` command. After the deregistration is complete, the new nodes appear in the catalog and the log messages stop:
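For reference, a manual deregistration via the HTTP API looks roughly like the following; the node name `consul-server-0` and the datacenter `dc1` are placeholder values, and the agent is assumed to be listening on the default local address:

```shell
# PUT to the catalog deregister endpoint to drop a stale node entry
curl --request PUT \
  --data '{"Datacenter": "dc1", "Node": "consul-server-0"}' \
  http://127.0.0.1:8500/v1/catalog/deregister
```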
Feature Description
Essentially, the problem is that snapshots contain the node list, so restoring a snapshot in a different cluster leaves stale node entries in the catalog and produces the error messages shown above.
There may be some different approaches to solve this, such as:

- A `consul` CLI command to deregister nodes from the catalog (this makes things a little more convenient than running `curl` commands, but is really just making the workaround easier)

Use Case(s)
Anyone moving a Consul installation (this one backs a Vault installation) or using the snapshot save/restore capability for backups would be affected by this problem.
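To make the proposed CLI command concrete, an invocation might look something like this (the subcommand name and flags are purely hypothetical; no such command exists today):

```shell
# Hypothetical CLI equivalent of the curl-based deregistration workaround
consul catalog deregister -node=consul-server-0 -datacenter=dc1
```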
Contributions
No