Open borgez opened 3 years ago
+1
+1
Any workaround to get this working?
+1
+1
Changing the Prometheus service address to cattle-prometheus/access-prometheus:80 may give you some information (incomplete).
+1
I can confirm using cattle-prometheus/access-prometheus:80
provides some information:
The main cluster master/worker CPU/memory dashboard does not work, and neither does the nodes overview.
However, individual selections of single resources (pods, deployments, replicasets, daemonsets, statefulsets) work.
The individual selection of nodes shows only requests & limits, not actual usage.
This is already super amazing though. I love this tool! I think a "rancher" preset would probably make sense in the future :)
Actually, it depends which default you choose: there seem to be differences between "Helm" and "Prometheus Operator". Some things don't work with one, and other things don't work with the other. I'd be happy to provide exact results of what works with which, if that would be useful.
+1
@steebchen I can confirm it works on v2 as well
For v2 it is cattle-monitoring-system/rancher-monitoring-prometheus:9090
It is kind of weird that it doesn't show all the information for v2, because v2 seems to be pretty much a pre-installed Prometheus. I wonder if we need to add a Lens-specific metric.
Related to #1865
@maxisam I changed it but it does not work. Rancher v2.5.7, cattle-monitoring-system/rancher-monitoring-prometheus:9090. Did you choose Helm or Prometheus Operator?
@houshym With the latest Lens and 2.5.7 it seems like I don't need to do anything. Just pick auto and it works (most of it).
@maxisam Lens version 4.2.0, Rancher 2.5.7, and it does not work; fresh k8s install.
@houshym Mine is on 1.19.8 and I can see
Can someone from the developers explain which Prometheus metrics are needed to display the CPU/memory graphs in the Cluster view?
@nitrogear I would recommend looking at the files within https://github.com/lensapp/lens/tree/master/src/main/prometheus for the queries we use.
Rancher: v2.6.0 rancher-monitoring chart: v100.0.0+up16.6.0
Now, I am completely green when it comes to Rancher, Kubernetes, and Prometheus, but I'm persistent and decided to do some digging.

It appears that the default rancher-monitoring chart does not set the node label in the node-exporter configuration. Looking over the queries that Lens uses, it expects a few node metrics to return that label: node_memory_MemTotal_bytes, node_cpu_seconds_total, and probably something I missed.

Looking through the rancher-monitoring chart's values.yaml, I noticed there was a commented-out relabelings: section under nodeExporter: that renamed __meta_kubernetes_pod_node_name. So I thought, what the heck, let's set that and see what happens. After a bit of back and forth with it, I got it to work:
```yaml
nodeExporter:
  relabelings:
    - sourceLabels: [__meta_kubernetes_pod_node_name]
      separator: ;
      regex: ^(.*)$
      targetLabel: node
      replacement: $1
      action: replace
```
Then, choosing prometheus-operator and setting the PROMETHEUS SERVICE ADDRESS to cattle-monitoring-system/rancher-monitoring-prometheus:9090 seems to make everything work.
I would love someone to verify this, though, because I went about changing a lot of different chart values before I found this one. It's possible I left one of those old changes in and that's what made this work.
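For anyone unfamiliar with Prometheus relabeling, here is a simplified sketch (not the real Prometheus code) of what that replace rule does: it reads the value of the source label, matches it against the regex, and writes the replacement into the target label, which gives every node-exporter series the node label that Lens's queries expect.

```typescript
// Simplified model of the 'replace' relabel rule above:
// join sourceLabels, match the regex, write $1 into targetLabel.
function relabelNode(labels: Record<string, string>): Record<string, string> {
  const value = labels["__meta_kubernetes_pod_node_name"] ?? "";
  const m = value.match(/^(.*)$/); // regex: ^(.*)$
  if (m) {
    labels["node"] = m[1]; // replacement: $1 -> first capture group
  }
  return labels;
}

console.log(relabelNode({ __meta_kubernetes_pod_node_name: "worker-1" }).node);
// -> worker-1
```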
@McFistyBuns Thanks for sharing! that is cool!
@McFistyBuns I finally had a chance to test it. Unfortunately, it doesn't work for me. I'm still missing a couple of things, like CPU and disk.
I think I solved this. Let's look, for example, at https://github.com/lensapp/lens/blob/master/src/main/prometheus/operator.ts#L53. The variable rateAccuracy is set to 1m, and the default scrape interval is 1 minute. So the query returns an empty result, because I believe the rate function needs at least 2 data points to work.
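The intuition can be sketched as a toy model: count how many scrape timestamps land inside the rate window (t - window, t]. With a 60s scrape interval and a 1m window, at most one sample falls in the window, so rate() has nothing to difference.

```typescript
// Toy model: which scrape timestamps fall inside the rate window (t - window, t]?
function samplesInWindow(scrapeIntervalSec: number, windowSec: number, t = 120): number[] {
  const stamps: number[] = [];
  for (let s = 0; s <= t; s += scrapeIntervalSec) stamps.push(s);
  return stamps.filter((s) => s > t - windowSec && s <= t);
}

// 60s scrape interval + 1m rate window: only one sample in the window,
// so rate() cannot compute a slope and returns an empty result.
console.log(samplesInWindow(60, 60).length); // 1
// 30s scrape interval: two samples in the window, so rate() works.
console.log(samplesInWindow(30, 60).length); // 2
```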
Armed with this knowledge, I set the nodeExporter interval to 30 seconds. The resulting section looks like this:
```yaml
nodeExporter:
  enabled: true
  jobLabel: jobLabel
  serviceMonitor:
    interval: 30s
    metricRelabelings: null
    proxyUrl: ''
    relabelings:
      - action: replace
        regex: ^(.*)$
        replacement: $1
        separator: ;
        sourceLabels:
          - __meta_kubernetes_pod_node_name
        targetLabel: node
    scrapeTimeout: ''
```
You beat me to it. I just got back into the office after the holidays and was going to mention that I had forgotten I changed that as well.
Thanks @McFistyBuns @kimmornetum, I can confirm it works with rancher-monitoring 9.4.203 on Rancher 2.5.8.
Does anyone have a working config for Lens 5.5.4 and Rancher Server 2.6.6 with rancher-monitoring 100.1.2+up19.0.3? I can't seem to get memory under nodes to show any data, though it shows up under pods. Also, I can't get the Cluster Dash to populate 'in use' nodes.
This seems to be a problem with the basic Helm install of Prometheus, which defaults to a 60s interval. Reducing it to a 30s interval works for the reasons @kimmornetum mentioned above, though I was confused about why it was not working in the first place, because https://github.com/lensapp/lens/blob/master/src/main/prometheus/helm.ts has rateAccuracy set to 5m. My TypeScript knowledge is a little lacking, but it seems that the readonly field is not overriding the value in the parent class for getQuery, as the author of the code may have intended.
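One plausible mechanism for that symptom (a sketch with hypothetical class names, not the actual Lens classes, so take it as an assumption): a subclass field initializer only runs after the base constructor finishes, so any value read or cached during base construction still sees the base default, even though methods called after construction see the override.

```typescript
class BaseProvider {
  readonly rateAccuracy: string = "1m";
  readonly capturedAtConstruction: string;

  constructor() {
    // Subclass field initializers have NOT run yet at this point,
    // so this reads the base value "1m".
    this.capturedAtConstruction = this.rateAccuracy;
  }

  getQuery(): string {
    // Called after construction, this sees the subclass override.
    return `rate(node_cpu_seconds_total[${this.rateAccuracy}])`;
  }
}

class HelmProvider extends BaseProvider {
  readonly rateAccuracy = "5m";
}

const p = new HelmProvider();
console.log(p.capturedAtConstruction); // "1m" -- captured too early
console.log(p.getQuery());             // "rate(node_cpu_seconds_total[5m])"
```

If the query string (or rateAccuracy) were captured during construction like this, the 5m setting in helm.ts would never take effect, which would match the behavior described above.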
What would you like to be added: Rancher monitoring integration
Why is this needed: Lens does not autodiscover Rancher metrics