I think I can help
@dipankardas011 Thank you! My initial thought on this was to replace the list var of worker nodes here with a map that includes the labels. WDYT?
We would need to use node selectors for Prometheus, Flux and any other components so they run on the internal node.
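As a rough sketch of the selector approach, each internal component would pin itself to the internal node via a node selector. The label key and value below are placeholders, not agreed names:

```yaml
# Minimal sketch: pin an internal component to the internal node.
# The label key/value are assumptions until the node labels are decided.
apiVersion: v1
kind: Pod
metadata:
  name: internal-component
spec:
  nodeSelector:
    cncf.io/node-role: internal  # assumed label on the internal node
  containers:
    - name: app
      image: nginx
```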
Another approach is to add a taint to the falco node and ask the falco team to add a toleration in https://github.com/falcosecurity/cncf-green-review-testing
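For illustration, the taint variant could look like this; the taint key and value are assumptions, not something the Falco team has agreed to yet:

```yaml
# Sketch: the falco node would carry a taint, e.g.
#   kubectl taint nodes <falco-node> cncf.io/project=falco:NoSchedule
# and the Falco workloads would add a matching toleration:
tolerations:
  - key: cncf.io/project
    operator: Equal
    value: falco
    effect: NoSchedule
```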
cc @nikimanoledaki @AntonioDiTuri
Yes, we can do this using node label selectors or taints and tolerations.
Actually, I was trying out a specific problem related to this where I wanted to schedule pods only to control plane nodes using taints and tolerations.
Maybe you can tell me if it helps with this issue.
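Roughly, it needs both a toleration for the control plane taint and a selector on the control plane label. A minimal sketch, assuming the standard kubeadm taint and label:

```yaml
# Pod spec fragment: run only on control plane nodes.
spec:
  nodeSelector:
    node-role.kubernetes.io/control-plane: ""  # standard kubeadm label
  tolerations:
    - key: node-role.kubernetes.io/control-plane  # standard kubeadm taint
      operator: Exists
      effect: NoSchedule
```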
@dipankardas011 That's a nice approach, and in the future we may want to run some workloads on our control plane node if we start to max out the "system" node where we run Prometheus, Flux etc.
A downside I see with a taint per project is we need to run Kepler on all nodes. So we'd need to add tolerations for all the taints.
How about we start with adding node selectors to the kube-prometheus-stack helm release and the flux bootstrap?
https://fluxcd.io/flux/installation/configuration/boostrap-customization/
If adding a node selector for each component becomes too hard to manage we can look at alternatives later.
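As a sketch, the HelmRelease values could set a node selector per component. The label is a placeholder and the exact values paths should be double-checked against the kube-prometheus-stack chart:

```yaml
# HelmRelease values sketch for kube-prometheus-stack (paths to verify).
prometheusOperator:
  nodeSelector:
    cncf.io/node-role: internal  # assumed label on the internal node
prometheus:
  prometheusSpec:
    nodeSelector:
      cncf.io/node-role: internal
grafana:
  nodeSelector:
    cncf.io/node-role: internal
```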
I have seen usage of scheduling profiles (to reduce the use of node selectors and ... by just modifying the scheduling profile), but it has a major downside: no support for DaemonSet pods. https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#node-affinity-per-scheduling-profile
Check the Note sections.
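For context, a minimal sketch of that scheduler configuration, assuming a placeholder label; pods opt in by setting the matching schedulerName:

```yaml
# KubeSchedulerConfiguration sketch: a profile that appends a node
# affinity to every pod scheduled with schedulerName: internal-scheduler.
# Documented caveat: the DaemonSet controller does not use scheduling
# profiles, so DaemonSet pods are unaffected by this.
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: internal-scheduler
    pluginConfig:
      - name: NodeAffinity
        args:
          addedAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: cncf.io/node-role  # assumed label
                      operator: In
                      values:
                        - internal
```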
cc @rossf7 @nikimanoledaki @AntonioDiTuri
I need to modify the fluxcd manifests for installing Prometheus, Kepler, ... Even if we do that, the Prometheus node exporter will be a DaemonSet, thus present on every node; not sure about Kepler!
@dipankardas011 The node selectors are needed in the kube-prometheus-stack helm release and also for the flux components.
https://github.com/fluxcd/flux2/issues/2252#issuecomment-1002790427 https://fluxcd.io/flux/installation/configuration/boostrap-customization/
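Following the bootstrap customization docs, a sketch of a flux-system kustomization.yaml patch that adds a node selector to all Flux controllers (the label value is a placeholder):

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  - patch: |
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: all
      spec:
        template:
          spec:
            nodeSelector:
              cncf.io/node-role: internal  # assumed label
    target:
      kind: Deployment
      labelSelector: app.kubernetes.io/part-of=flux
```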
> Even if we do that, the Prometheus node exporter will be a DaemonSet, thus present on every node; not sure about Kepler!
It's fine for the Kepler DaemonSet to schedule pods on all nodes. This is so we can measure the overall energy consumption of the cluster.
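And if we do end up adding per-project taints later, a blanket toleration on the Kepler DaemonSet would keep it on every node. A sketch, assuming a tolerate-everything toleration is acceptable for the measurement use case:

```yaml
# DaemonSet pod spec fragment: an empty key with operator Exists
# tolerates every taint, so Kepler pods still land on all nodes.
tolerations:
  - operator: Exists
```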
This issue is to create separate Kubernetes nodes for our internal stack (Flux / Prometheus) and Falco (the first project we're measuring).
Node isolation is important to ensure we can measure the footprint of projects accurately.
See https://github.com/falcosecurity/cncf-green-review-testing/issues/2 for Falco node requirements
Node requirements
The nodes will be managed by OpenTofu and have these names and node labels. We will start with using node labels and selectors for placing pods on nodes, but we may also need to introduce node taints and tolerations.
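To illustrate the intent (the node name and label key below are placeholders until the OpenTofu config is finalized), a labelled node might look like:

```yaml
# Hypothetical node labelling; actual names/labels are defined in OpenTofu.
apiVersion: v1
kind: Node
metadata:
  name: internal-node            # placeholder name
  labels:
    cncf.io/node-role: internal  # placeholder label
```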