Closed kragniz closed 6 years ago
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: We suggest the following additional approver: wallrj
Assign the PR to them by writing /assign @wallrj
in a comment when ready.
The full list of commands accepted by this bot can be found here.
I'm not sure this is the right way to decide on seed nodes. I've a few questions, and a few concerns:
1) Is it likely a user would ever want to change a seed node, or perhaps would we ever want to? (e.g. during a maintenance event or upgrade) 2) How many seeds are usually run in a cluster? Is one per zone usual? 3) What if a user has a single-zone cluster, but wants more than one seed? (from what I understand, a minimum of one seed is required in order to interact with the cluster?) 4) How dynamic is changing seed nodes? Is it an expensive operation?
I'm concerned we are suddenly jumping to a new grouping mechanism that currently, no part of Navigator acknowledges (zone). We do need to introduce a grouping mechanism, but I rather hoped it could be based on labels, of which could be arbitrarily defined by the user.
These failure-domain annotations are quite standard in cloud environments, but not so much in clusters without them. I have a feeling our e2e testing environment (minikube) for example, does not set these parameters. We can't restrict users to requiring certain labels are set on the nodes that pods run on.
In terms of moving this PR forward, I think changing it to simply make the first pilot in a node pool be a seed would suffice. It only addresses point (2) & (3) above, but it does remove that binding to zones/fixed labels.
We do need to work out how this is properly expressed though, but I think it's a more complex problem than this (we need to allow it to be highly configurable by the user, but still validated by Navigator to ensure the desired configuration is valid).
Is it likely a user would ever want to change a seed node, or perhaps would we ever want to? (e.g. during a maintenance event or upgrade)
I can't think of a good reason to manually change the seed nodes.
How many seeds are usually run in a cluster? Is one per zone usual?
A small number per zone, 1-3. Netflix runs 1 per AWS zone, others recommend 2 or 3.
What if a user has a single-zone cluster, but wants more than one seed? (from what I understand, a minimum of one seed is required in order to interact with the cluster?)
We could add an option for number of seeds per zone? A minimum of one live seed is required to add new nodes to the cluster.
How dynamic is changing seed nodes? Is it an expensive operation?
Very cheap, any node can become a seed node at any point. Gossip traffic will simply be directed at other nodes.
I like the idea of navigator working on cloud providers correctly by default, so I'm keen to implement something that doesn't require a lot of thought from an end user. How would you feel if the node labels were configurable, with the current labels as defaults?
I made an alternative pr for selecting the first pod in a nodepool as a seed here: #254
@kragniz PR needs rebase
@kragniz: The following test failed, say /retest
to rerun them all:
Test name | Commit | Details | Rerun command |
---|---|---|---|
navigator-quick-verify | 45cb65b82f55f366d9065e2e2f57431b15977449 | link | /test verify |
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Closing in favour of #254
This labels a single cassandra node per failure zone as a seed node, allowing a seed provider to pick up this information (via directly querying the endpoints, or via srv records) and pass it to kubernetes.
This is currently based off the node
failure-domain.beta.kubernetes.io/zone
label, but could be made configurable in the future (probably the same as #247)/area cassandra