dotdc / grafana-dashboards-kubernetes

A set of modern Grafana dashboards for Kubernetes.
Apache License 2.0

[enhancement] Windows support #79

Open jkroepke opened 10 months ago

jkroepke commented 10 months ago

Describe the enhancement you'd like

I have some clusters with Windows nodes enabled. I would like to ask whether I can add Windows support, or do you think it is out of scope here?

Unlike kubernetes-mixin, which has separate dashboards, I would like to add the Windows queries to the existing ones. That's possible by combining queries with OR, e.g.:

sum(container_memory_working_set_bytes{cluster="$cluster",namespace=~"$namespace", image!="", pod=~"${created_by}.*"}) by (pod)
OR
<WINDOWS Query>
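
To make the OR approach a bit more concrete, here is a minimal sketch of what a combined query could look like. PromQL's `or` operator returns every series from the left-hand side plus those series from the right-hand side whose label sets have no match on the left, so aggregating both sides by `pod` lets Linux and Windows pods show up in one result. The Windows metric name `example_windows_container_memory_bytes` is a hypothetical placeholder used only for illustration, not a real exporter metric; the actual Windows query is not spelled out in this issue.

```promql
# Linux pods: cAdvisor working-set memory (the query from above).
sum(container_memory_working_set_bytes{cluster="$cluster", namespace=~"$namespace", image!="", pod=~"${created_by}.*"}) by (pod)
or
# Windows pods: HYPOTHETICAL metric name, for illustration only.
sum(example_windows_container_memory_bytes{cluster="$cluster", namespace=~"$namespace", pod=~"${created_by}.*"}) by (pod)
```

Since a pod runs on either a Linux or a Windows node, the two sides should normally produce disjoint `pod` label sets, so nothing on the right-hand side would be dropped by `or`.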

Additional context

Since I'm running multiple hybrid-OS clusters, I would like to open PRs for Windows pods here. I'm not expecting the maintainers to provide support for Windows. Before I start working on this, I would like to know whether it would be accepted.

dotdc commented 10 months ago

Hi @jkroepke,

I'm not totally against it, but I have a few doubts...

Here are a few questions to start the discussion:

- If I understand correctly, you have clusters with both GNU/Linux and Windows node pools. Could you share your use-case? (just curious)
- Which dashboard(s) are you planning to add Windows support for?
- Why do you think having the queries in here would be better than having them in separate dashboards?
- Will there be any OS specific panels?
- After the initial work, would you be able to help Windows users if they have issues related to your change?

If anyone is also using Windows hosts, feel free to jump in the discussion with your thoughts!

jkroepke commented 10 months ago

> If I understand correctly, you have clusters with both GNU/Linux and Windows node pools. Could you share your use-case? (just curious)

Sure! The Kubernetes control plane does not support Windows, so I have to run Linux nodes (e.g. for CoreDNS, ArgoCD) alongside Windows node pools for the applications (the customer runs .NET applications).

On Azure, the managed Kubernetes service requires at least one Linux node, and you can then add Windows nodes. It is also quite common that customers want to use standard infrastructure components like Redis, Solr or Elasticsearch, which do not run on Windows nodes. In general, we avoid Windows nodes as much as possible. Hybrid-OS clusters are common because a Windows-only cluster cannot exist.

> Which dashboard(s) are you planning to add Windows support for?

Cluster, Namespace, Node, Pod

> Why do you think having the queries in here would be better than having them in separate dashboards?

We have namespaces that contain both Windows and Linux containers. My personal opinion is that a single dashboard provides a better user experience.

Since kube-state-metrics provides requests/limits for pods on any OS, Windows pods are already included in the Pod CPU/Memory Requests/Limits panels, but excluded from the usage panels. With separate dashboards, the requests/limits of Windows pods would then have to be excluded from the Linux ones.

Example Panel: Real (linux only), Requests/Limits (any OS)

image
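
As a rough illustration of why the requests/limits panels already cover Windows pods: a request query built on kube-state-metrics has no OS-specific selector. The sketch below assumes the kube-state-metrics v2 metric `kube_pod_container_resource_requests` and reuses the `$cluster` and `$namespace` variables from the query earlier in this issue; the dashboards' actual panel queries may differ.

```promql
# Memory requests per pod, as reported by kube-state-metrics.
# Nothing in the selector depends on the node OS, so Linux and Windows pods both match.
sum(kube_pod_container_resource_requests{cluster="$cluster", namespace=~"$namespace", resource="memory"}) by (pod)
```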

> Will there be any OS specific panels?

No. Where Windows does not provide the metrics, some panels will remain Linux-specific, but I do not plan to add new panels.

> After the initial work, would you be able to help Windows users if they have issues related to your change?

Yes, because our company has an upstream-first culture. If other users hit a bug, we may well have the same bug. Of course, I will help.

The good thing is that the dashboards provide only basic metrics. I don't plan to engineer the queries on my own. kubernetes-mixin provides some rules for Windows, which I will use as a base for the panels. I won't add a dependency on the kubernetes-mixin recording rules.


Since I expect some large changes with #15, I plan to start after the cluster variable work is finished.

dotdc commented 10 months ago

Hi @jkroepke, I'm fine with the idea, we can try to add Windows support, as long as it doesn't impact the experience on Linux. Will just finish the work on the other issues before looking further into this.

jkroepke commented 10 months ago

> Will just finish the work on the other issues before looking further into this.

I would prefer to split the Windows integration into several smaller PRs, at minimum one PR per dashboard. Let me know when I can start with the cluster dashboard.

dotdc commented 10 months ago

@jkroepke The other issues/PRs are done, so you can start working on this whenever you want.

jkroepke commented 7 months ago

Hey @dotdc ,

In #103, I tried to split the Linux and Windows queries whenever possible. Do you prefer this kind of solution, or should we prefer a more complex query, as seen in the CPU by namespace queries?

dotdc commented 7 months ago

Hi @jkroepke, I think I prefer the split version, but if it's not possible to have that everywhere, maybe go for the other one for consistency. What do you think?

jkroepke commented 7 months ago

> What do you think?

The combined solution (one query for both Windows and Linux) results in very complex, hard-to-understand queries. Because of that complexity, external users may avoid contributing improvements, or they may break the Windows support by accident.

dotdc commented 7 months ago

Agreed. Do you think it's possible to split the combined queries like "CPU by namespace"?

jkroepke commented 7 months ago

Maybe, no idea yet. I may have to ask on the Grafana Slack.

jkroepke commented 6 months ago

> Do you think it's possible to split the combined queries like "CPU by namespace"?

In theory yes, but that makes it complex on the Grafana side:

image

dotdc commented 6 months ago

Can we manage all the cases using combined queries?

If that's the case, it might be better to go for the combined queries, with comments, for consistency. Here's an example:

image

What do you think?

dotdc commented 6 months ago

Also, it would be nice to document Windows support in the docs (README.md) for other users.

dotdc commented 4 months ago

Hi @jkroepke,

Do you plan to add Windows support on the other dashboards?

jkroepke commented 4 months ago

Hi @dotdc,

In general, yes. But my time is a bit limited and I had to focus on windows_exporter. Once I'm done there, I will continue.

dotdc commented 4 months ago

That's fine, let me know.