coreos / tectonic-installer

Install a Kubernetes cluster the CoreOS Tectonic Way: HA, self-hosted, RBAC, etcd Operator, and more
Apache License 2.0
601 stars 266 forks source link

Make kubelet node labels a key=value pair #1439

Open jescarri opened 7 years ago

jescarri commented 7 years ago

FEATURE REQUEST

Why is un-usable: There isn't an easy way to filter controller nodes vs worker nodes, we need to scrape etcd that lives in the same nodes as the control plane.

On tectonic 1.5.x there was a master / worker labels set to true, and made k8s sd super easy.

Versions

What you expected to happen?

Being able to easily filter the worker nodes and controllers.

How to reproduce it (as minimally and precisely as possible)?

Deploy a prometheus, using the k8s node SD try to filter only controllers or only workers.

jescarri commented 7 years ago

@joshrosso

joshrosso commented 7 years ago

xref: https://github.com/coreos/tectonic-installer/issues/667

Somewhat related, could give us more flexibility in node labeling via the installer and, by side-effect, provide a workaround until we solve this 'the right way'.

s-urbaniak commented 7 years ago

/cc @brancz

brancz commented 7 years ago

Yes as I was troubleshooting this yesterday, I looked at Prometheus code and realized that empty label value is basically equivalent to no label. So this is indeed the kind of change we need.

robszumski commented 7 years ago

@brancz playing devils advocate for a sec, why not change Prometheus, as a value-less label query is a valid thing in Kubernetes.

brancz commented 7 years ago

To be honest, I'm not sure how much effort it would be upstream. In my opinion we should support it, just a question of how breaking this is and how much effort.

luxas commented 6 years ago

This would be quite painful to change at this point in k8s... and I guess there are other uses for valueless labels in Prom as well

dghubble commented 6 years ago

@brancz what was the conclusion on the Prometheus side? In my testing, Prometheus 2.1 doesn't distinguish between non-existent key-value pairs and pairs with an "" value (need to check Prom 2.2).

Given node-role.kubernetes.io/master="" can't be undone, I'm considering of adding another label (node-role.kubernetes.io/controller="true") to masters (i.e. controllers) in the Typhoon distro to unblock discovery work. Label not final. https://github.com/poseidon/typhoon/pull/160

brancz commented 6 years ago

Changing this behavior would be too breaking, so it's not going to be changed. Any label that has a value would work, your suggestion sounds good to me.

brancz commented 6 years ago

Ultimately however, I don't believe etcd should be monitored by relabeling (relatively random) node information, best would be if etcd were self hosted or at least static pods, that way the pods could be discovered like any other workload on Kubernetes.

dghubble commented 6 years ago

Yeah, agreed, it would be ideal if the cluster's etcd was self-hosted and discovered as a Kubernetes service. Unfortunately, for now, that work stalled out due to difficulties and bootkube isn't targeting it anymore, so I'm guessing it will be a long time before that's viable.

brancz commented 6 years ago

Yes, I just wanted to make sure everyone is aware that those things are workarounds at best. Any reason why static pods for etcd are not viable (you could have a node pool specifically for etcd)?

Quentin-M commented 6 years ago

X-ref: https://github.com/prometheus/prometheus/issues/3726#issuecomment-365337866

This is still a pain point for resource planning, especially if we don't want to add random labels that carry no purpose besides working around a monitoring limitation.

discordianfish commented 6 years ago

Unfortunately, for now, that work stalled out due to difficulties and bootkube isn't targeting it anymore, so I'm guessing it will be a long time before that's viable.

Oh? I thought everyone moves towards self-hosted?

Either way, would make sense to have a role label with the role as key either way. But the current labels are pretty much 'standardized' and I guess changing them would break lot of things. But adding a new label would make sense I guess.

dghubble commented 6 years ago

Just so there is no misunderstanding (for readers who stumble on this without context), bootkube does self-host the control plane and others are aiming to do the same. Self-hosting etcd itself was a separate and much harder challenge that bootkube is no longer working on - you bring your own etcd when using bootkube.