showMultiCluster option should allow selecting multiple (and all) values

hwoarang commented 4 years ago

Right now, when showMultiCluster is set to true, we can only select a single value for the cluster variable. It would be beneficial to be able to select multiple values so that we can observe multiple clusters at the same time, and potentially also to enable includeAll so that we can look at all of them at once.

My understanding is that the following template in the various dashboards would need to be modified to allow overrides, but any input is more than welcome.

https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/master/dashboards/resources/cluster.libsonnet#L7-L16
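As a rough illustration of what such an override could produce, here is a minimal sketch of a cluster template variable written as plain Grafana dashboard JSON in Jsonnet. It is not the code from the linked file: the local `config` object, the `kube_pod_info` source metric and the `allValue` choice are assumptions made for the example, while `multi`, `includeAll` and `hide` are the standard Grafana variable fields.

```jsonnet
// Sketch only: `config` stands in for the mixin's _config; values are assumed.
local config = {
  clusterLabel: 'cluster',
  showMultiCluster: true,
};

{
  templating: {
    list: [
      {
        name: 'cluster',
        type: 'query',
        datasource: '$datasource',
        // Assumed source metric for cluster names; substitute your own.
        query: 'label_values(kube_pod_info, %s)' % config.clusterLabel,
        // 0 shows the variable, 2 hides it on single-cluster setups.
        hide: if config.showMultiCluster then 0 else 2,
        // Allow picking several clusters, or all of them at once.
        multi: config.showMultiCluster,
        includeAll: config.showMultiCluster,
        // Regex used when "All" is selected; only works with =~ matchers.
        allValue: '.*',
        refresh: 1,
        sort: 1,
      },
    ],
  },
}
```

With `allValue` set to a regex, every query that uses `$cluster` would have to use a regex matcher, otherwise selecting "All" would match nothing.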

hwoarang commented 4 years ago

Also, the queries will need to be modified from %(clusterLabel)s="$cluster" to %(clusterLabel)s=~"$cluster"

As such, I'd like to get some feedback and opinions on whether this would be acceptable before I start working on a PR

brancz commented 4 years ago

cc @csmarchbanks

csmarchbanks commented 4 years ago

Generally, the idea is that you start at the multi-cluster dashboard, and drill your way down to cluster/namespace/workload/pod. Is there something missing from the multi-cluster dashboard that could be improved upon?

hwoarang commented 4 years ago

@csmarchbanks The idea I had in mind is this. Say you have the same app running on clusterA and clusterB and you would like to compare how they perform side by side. You select both clusters from the drop-down and see graphs that show the pods from both clusters.

Say the dashboard uses the following query

rate(http_requests_total{cluster="A"}[5m])

then you only get graphs for a single cluster, but by simply converting the variable to multi-value and the matcher to a regex you can get both on the same dashboard

rate(http_requests_total{cluster=~"A|B"}[5m])

The good thing is that the query change also works when you select a single cluster, so it won't break existing behavior, but you will get the multi-cluster comparison for free if the query supports it.

csmarchbanks commented 4 years ago

For that use-case it is likely the namespace and workload dashboards would also need a cluster multi-select in order to compare. Would it also be good to make the namespace a multi-select at that point to compare similar workloads across two different namespaces?

I think this could be an interesting change, but am not sure the complexity is worth it. The changes will likely end up being rather invasive. Each data table and graph will need additional cluster labels as well to ensure drill downs and labels are appropriately unique. I'd be happy to see a prototype, and depending on the result discuss adding it!
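To make the point about extra cluster labels concrete, here is a minimal sketch of a single graph target that groups by cluster as well as pod, so that series from two clusters with identically named pods stay distinct. The `container_cpu_usage_seconds_total` metric and the local `config` object are assumptions for the example; the mixin's own dashboards may use different expressions.

```jsonnet
// Sketch only: `config` stands in for the mixin's _config.
local config = { clusterLabel: 'cluster' };

{
  targets: [
    {
      // Group by cluster and pod so same-named pods in different
      // clusters do not collapse into one series.
      expr: 'sum by (%(clusterLabel)s, pod) (rate(container_cpu_usage_seconds_total{%(clusterLabel)s=~"$cluster", namespace="$namespace"}[5m]))' % config,
      // Prefix the legend with the cluster name to keep entries unique.
      legendFormat: '{{%(clusterLabel)s}}/{{pod}}' % config,
    },
  ],
}
```

Every table and graph in the drill-down dashboards would need a similar change, which is where the invasiveness comes from.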

hwoarang commented 4 years ago

> For that use-case it is likely the namespace and workload dashboards would also need a cluster multi-select in order to compare. Would it also be good to make the namespace a multi-select at that point to compare similar workloads across two different namespaces?
>
> I think this could be an interesting change, but am not sure the complexity is worth it. The changes will likely end up being rather invasive. Each data table and graph will need additional cluster labels as well to ensure drill downs and labels are appropriately unique. I'd be happy to see a prototype, and depending on the result discuss adding it!

It's true that this can explode in many different ways. What I had in mind is clusters which run nearly identical workloads such as canary releases, A/B, blue/green testing etc. In other words, the use case is for clusters that have a large number of similarities in which the only major difference is just the cluster name. Namespaces, services etc can be considered as 'constant' parameters.

johnpemberton commented 4 years ago

I'm keen on this feature too. We have a multi-cluster setup: two application clusters, where teams' workloads live, and a management cluster, where CI/CD and other tooling lives. When teams deploy their applications, we use a 'deploy everywhere' strategy, so if they want 4 replicas, it'll put 2 in each of the application clusters and load balance outside the clusters using DNS.

It would be great to see the metrics for particular namespaces/workloads across all clusters in a single view.

Is one option not just having a duplicate of the kubernetes-compute-resources-namespace-pods and some of the other dashboards but without the cluster variable? Appreciate that's more dashboards, but avoids the additional complexity required to support an all/multi-select option?

johnpemberton commented 4 years ago

I've had a crack at this today. By no means PR ready yet though.

Basic approach was to create a clusterSelector config setting based on the showMultiCluster config setting.

showMultiCluster was also used to enable multi selects for the cluster variable.
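Roughly, the idea can be sketched as below. This is an illustration rather than the code on the branch: the shape of `clusterSelector`, the local `mixin` object and the `kube_pod_info` query are assumptions made for the example.

```jsonnet
// Sketch only: a derived clusterSelector that switches matcher type.
local mixin = {
  _config:: {
    local cfg = self,
    clusterLabel: 'cluster',
    showMultiCluster: false,
    // Hypothetical setting: exact match by default, regex match when
    // multi-cluster selection is enabled.
    clusterSelector: '%(label)s%(op)s"$cluster"' % {
      label: cfg.clusterLabel,
      op: if cfg.showMultiCluster then '=~' else '=',
    },
  },
  // Example query; the metric is a placeholder.
  query: 'sum(kube_pod_info{%s}) by (namespace)' % $._config.clusterSelector,
};

{
  // Uses cluster="$cluster".
  default: mixin.query,
  // Uses cluster=~"$cluster" once the flag is overridden.
  multi: (mixin { _config+:: { showMultiCluster: true } }).query,
}
```

Because the selector is derived from the flag, setups that leave showMultiCluster at false keep the existing exact-match queries.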

Dashboards seem to work fine, but drill-down links don't: when you select multiple options for cluster in Grafana, you get var-cluster=a&var-cluster=b&var-cluster=c in the URL query string, but the dashboard links are like this in the jsonnet:

link: '%(prefix)s/d/%(uid)s/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell' % { prefix: $._config.grafanaK8s.linkPrefix, uid: std.md5('k8s-resources-pod.json') }

I think if we could repeat that var-cluster query parameter based on what was currently selected, then this would work.
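For illustration, this is the link shape that would be needed, built here from a hard-coded list. The `clusters` list and the `./d/` prefix are placeholders; in practice the selection is only known in Grafana at view time, not when the Jsonnet is rendered, which is why this cannot simply be done in the mixin.

```jsonnet
// Hypothetical selection; Grafana only knows the real one at view time.
local clusters = ['a', 'b', 'c'];

// One var-cluster parameter per selected cluster.
local clusterParams = std.join('&', ['var-cluster=' + c for c in clusters]);

{
  link: './d/%(uid)s/k8s-resources-pod?var-datasource=$datasource&%(clusters)s&var-namespace=$namespace&var-pod=$__cell' % {
    uid: std.md5('k8s-resources-pod.json'),
    clusters: clusterParams,
  },
}
```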

Branch comparison here: https://github.com/kubernetes-monitoring/kubernetes-mixin/compare/master...johnpemberton:master

johnpemberton commented 4 years ago

I don't think there's any way to fix the URL parameter issue on the drill-down links without changes to Grafana itself, so that it accepts comma-separated URL parameters rather than repeated parameters.
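One avenue that may be worth testing: Grafana's documentation on advanced variable format options lists a `queryparam` format that expands a multi-value variable into repeated `var-<name>=<value>` parameters. Assuming that option is available in the Grafana version in use and is interpolated in table cell links (neither has been verified here), the link could be sketched as follows; the local `config` object and its `linkPrefix` value are placeholders.

```jsonnet
// Sketch only: relies on Grafana expanding ${cluster:queryparam} at view time.
local config = { grafanaK8s: { linkPrefix: '.' } };

{
  // ${cluster:queryparam} replaces the hand-written var-cluster=$cluster.
  link: '%(prefix)s/d/%(uid)s/k8s-resources-pod?var-datasource=$datasource&${cluster:queryparam}&var-namespace=$namespace&var-pod=$__cell' % {
    prefix: config.grafanaK8s.linkPrefix,
    uid: std.md5('k8s-resources-pod.json'),
  },
}
```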

github-actions[bot] commented 1 week ago

This issue has not had any activity in the past 30 days, so the stale label has been added to it.

The stale label will be removed if there is new activity
The issue will be closed in 7 days if there is no new activity
Add the keepalive label to exempt this issue from the stale check action

Thank you for your contributions!

kubernetes-monitoring / kubernetes-mixin

showMultiCluster option should allow selecting multiple (and all) values #444