kubernetes-sigs / kubebuilder

Kubebuilder - SDK for building Kubernetes APIs using CRDs
http://book.kubebuilder.io
Apache License 2.0

Grafana dashboard for visualizing controller-runtime metrics #2183

Closed robbie-demuth closed 2 years ago

robbie-demuth commented 3 years ago

What do you want to happen?

The Kubebuilder book has a page dedicated to metrics. It explains how controller-runtime publishes metrics, how to protect them using kube-rbac-proxy, and how to export them for Prometheus.

All operators built using Kubebuilder publish the same metrics (albeit with different dimensions for some labels). Many teams would benefit from visualizations built on top of these metrics like:

It would be great if Kubebuilder provided a Grafana dashboard (and made it available on https://grafana.com/grafana/dashboards) that teams could use to quickly visualize operator metrics. This would prevent teams using Grafana from each having to build their own dashboards.

I wasn't sure whether to open this issue against kubebuilder or controller-runtime. I'm more than happy to move it to controller-runtime if desired.

Thanks!

Extra Labels

No response

camilamacedo86 commented 3 years ago

That would be here. Let's raise that in the Kubebuilder, Controller Runtime, and Controller Tools Meeting (Agenda doc: https://docs.google.com/document/d/1Ih-2cgg1bUrLwLVTB9tADlPcVdgnuMNBGbUl4D-0TIk/edit#heading=h.dp4gt7gj3cn )

camilamacedo86 commented 3 years ago

Hi @robbie-demuth,

In the Kubebuilder, Controller Runtime, and Controller Tools Meeting, this RFE was discussed, and it seems that it could fit well as an optional plugin. Please feel free to check https://book.kubebuilder.io/plugins/plugins.html and contribute if you wish. This RFE was not prioritized into the backlog; however, if anyone would like to push a new plugin that provides this feature, that would be great. Your contribution is very welcome here.

Also, feel free to reach out to us in the kubebuilder channel if you need help with that.

camilamacedo86 commented 2 years ago

/remove-lifecycle rotten

varshaprasad96 commented 2 years ago

Hi @iamrajiv. That's great! We have added this to the GSoC list of projects. The following links may give you an idea of the various projects we will be digging into to provide a Grafana dashboard as a plugin that exposes metrics:

  1. Controller Runtime: https://github.com/kubernetes-sigs/controller-runtime
  2. How to expose metrics from Prometheus to Grafana dashboard: https://prometheus.io/docs/visualization/grafana/
  3. The metrics exposed by KB: https://book.kubebuilder.io/reference/metrics.html
  4. Plugin architecture used by KB: https://book.kubebuilder.io/plugins/plugins.html

LeoLiuYan commented 2 years ago

@iamrajiv @varshaprasad96 That's great. It will be very helpful to us.

jj551 commented 2 years ago

@varshaprasad96 That's great

camilamacedo86 commented 2 years ago

@varshaprasad96, @rashmigottipati and @Kavinjsir, I was thinking about this one, and here is the proposed solution:

Proposed Solution:

When we run: $ kubebuilder edit --plugins=grafana/alpha-v1 OR kubebuilder init --plugins=grafana/alpha-v1

We will:

It would allow this plugin to be used with any project built with controller-runtime, no matter which language or layout the project's plugin scaffolded.

This approach meets the only requirement discussed in the issue: the Grafana files are not scaffolded by default, and an optional plugin is created to scaffold them.

IMPORTANT NOTE: We actually discussed this in the past, and the community already accepted it; the only prerequisite is that it be optional. It would therefore be our first example of an optional plugin and of how we can do nice things like that. See https://github.com/kubernetes-sigs/kubebuilder/issues/2183#issuecomment-863531273.

What about the details/constraints: (How/Where to implement it)

WDYT?

varshaprasad96 commented 2 years ago

@camilamacedo86 Agreed with this. Just another point as a follow up:

  1. The plugin when enabled helps users to set up one dashboard with controller-runtime metrics, and another with custom metrics - all steps to do the second one are documented.
  2. We can test this plugin against Ansible or any other plugin in addition to Go, as long as it exports some sort of metrics on an endpoint.

Kavinjsir commented 2 years ago

@camilamacedo86 Thanks for the detailed guidance! I just opened a PR for an enhanced proposal based on your designs and my original proposal.

I'm good with your implementation plan. Starting with an alpha plugin that builds on the existing metrics exposure can be an easy landing.

Some brief questions:

1.

We actually discussed this in the past, and the community already accepted it; the only prerequisite is that it be optional. It would therefore be our first example of an optional plugin and of how we can do nice things like that.

What does optional mean and why do we want to make it that way? (For the current plugins, users can already trigger them "optionally" with the flag --plugins=...; hence I'm wondering what this explicit optional means.)

2.

We can have dashboards scaffold all default metrics...

Does that mean we may provide multiple dashboards that display panels from different perspectives, with all default metrics applied?

3.

In this early stage, is it enough to provide just the raw JSON file that can be loaded directly in the Grafana Web UI?

camilamacedo86 commented 2 years ago

Hi @Kavinjsir,

That is great! My comments follow inline.

What does optional mean and why do we want to make it that way?

Optional means that the plugin will not be called in the default scaffold. It means that:

Why do it in this way?

Following a few reasons

Does that mean we may provide multiple dashboards that display panels from different perspectives, with all default metrics applied? In this early stage, is it enough to provide just the raw JSON file that can be loaded directly in the Grafana Web UI?

By default scaffold we mean that the plugin creates the directory and the files with the content required by Grafana, so that users only need to apply that content on the cluster and check the result.
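As a rough illustration of what such a scaffold could produce, the sketch below writes a small dashboard JSON file into a `grafana/` directory. The metric names are the ones a controller-runtime manager exposes on `/metrics`; the file path, panel titles, PromQL windows and the `build_dashboard`/`scaffold` helpers are assumptions made for this example, not the plugin's actual output. A real dashboard would also carry further fields such as a data source and a schema version.

```python
import json
from pathlib import Path

# (panel title, PromQL expression) pairs. The metric names are exposed by
# controller-runtime; titles and the 5m rate window are illustrative choices.
PANELS = [
    ("Reconciliations per second",
     "sum(rate(controller_runtime_reconcile_total[5m])) by (controller, result)"),
    ("Reconcile errors per second",
     "sum(rate(controller_runtime_reconcile_errors_total[5m])) by (controller)"),
    ("Workqueue depth",
     "sum(workqueue_depth) by (name)"),
]


def build_dashboard(title, panels):
    """Return a dict shaped like a (minimal) Grafana dashboard JSON model."""
    return {
        "title": title,
        "panels": [
            {
                "id": i + 1,
                "title": panel_title,
                "type": "timeseries",
                # Two panels per row on Grafana's 24-column grid.
                "gridPos": {"h": 8, "w": 12, "x": 12 * (i % 2), "y": 8 * (i // 2)},
                "targets": [{"expr": expr, "refId": "A"}],
            }
            for i, (panel_title, expr) in enumerate(panels)
        ],
    }


def scaffold(root):
    """Write the dashboard under <root>/grafana/ and return the file path.

    The directory and file name are hypothetical.
    """
    out = Path(root) / "grafana" / "controller-runtime-metrics.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    dashboard = build_dashboard("Controller Runtime Metrics", PANELS)
    out.write_text(json.dumps(dashboard, indent=2))
    return out
```

Calling `scaffold(".")` in a project root would leave a JSON file that can be pasted into Grafana's dashboard import dialog, which matches the "raw JSON file" option raised in the question.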

We are not specifying here how the dashboards should look. However, it would make sense to have in mind:

Is it not possible to have the content scaffolded, plus a target that just applies the manifests on the cluster, and then check the dashboards (see, for example, https://grafana.com/docs/grafana/latest/variables/)?

Please let us know what you think and whether that makes sense. Also, feel free to reach out to us via Slack if you need to.

camilamacedo86 commented 2 years ago

The following are the metrics returned for an Operator:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 10.96.251.224:8443...
* TCP_NODELAY set
* Connected to e2e-gdcf-controller-manager-metrics-service.e2e-gdcf-system.svc (10.96.251.224) port 8443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* error setting certificate verify locations, continuing anyway:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
{ [15 bytes data]
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
{ [45 bytes data]
* TLSv1.3 (IN), TLS handshake, Certificate (11):
{ [1780 bytes data]
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* TLSv1.3 (IN), TLS handshake, Finished (20):
{ [36 bytes data]
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
} [8 bytes data]
* TLSv1.3 (OUT), TLS handshake, Finished (20):
} [36 bytes data]
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=e2e-gdcf-controller-manager-7f74cfc6dd-qgcm6@1656145994
*  start date: Jun 25 07:33:13 2022 GMT
*  expire date: Jun 25 07:33:13 2023 GMT
*  issuer: CN=e2e-gdcf-controller-manager-7f74cfc6dd-qgcm6-ca@1656145993
*  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
} [5 bytes data]
* Using Stream ID: 1 (easy handle 0x55e3941bbb40)
} [5 bytes data]
> GET /metrics HTTP/2
> Host: e2e-gdcf-controller-manager-metrics-service.e2e-gdcf-system.svc:8443
> user-agent: curl/7.68.0-DEV
> accept: */*
> authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IldJWW1IN2hpVE1WNmtFalFKRGE5Y01RUVExZXNIQW0tV25UdXpzYk9zcmsifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjU2MTQ5NjE4LCJpYXQiOjE2NTYxNDYwMTgsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJlMmUtZ2RjZi1zeXN0ZW0iLCJzZXJ2aWNlYWNjb3VudCI6eyJuYW1lIjoiZTJlLWdkY2YtY29udHJvbGxlci1tYW5hZ2VyIiwidWlkIjoiNWI5OTVkOWEtYzA3Yy00Mjg2LWFkMmQtNTI5ZWU2ODQ5NmQ2In19LCJuYmYiOjE2NTYxNDYwMTgsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDplMmUtZ2RjZi1zeXN0ZW06ZTJlLWdkY2YtY29udHJvbGxlci1tYW5hZ2VyIn0.Q4YFQSDeXW9dBgecfU8hNBvWIDNWG7AVqCj_9WiXyBGAruDzpEsVFJCBFhGdUo33jcSdh3-6rG83U_jTM2FnifXf_53M-K9-9eb2spkeZHc9YBlMN1opvXYykZU65x9VQU_lJh5LFKcfAf3GUeszM1sYEfaCHTZuZrl8qwLdBYFpHEP440Ii7hU_NjwtuuW9mE7fjAMwZxOiRzHmS-9bI6HwQM5NPAjCevEOE-qhGOJqhretFPKUM1a9hHrx3nc_AQUN0kgP0cpDFLdGUP4FwbI6Q0brzi_Tav9WvjAKMYUgpv189-JRrmIWb97zF98aJQ6E4Gih-wYaBkqiWvzYTA
> 
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [130 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
} [5 bytes data]
< HTTP/2 200 
< content-type: text/plain; version=0.0.4; charset=utf-8
< date: Sat, 25 Jun 2022 08:33:39 GMT
< 
{ [5 bytes data]
# HELP certwatcher_read_certificate_errors_total Total number of certificate read errors
# TYPE certwatcher_read_certificate_errors_total counter
certwatcher_read_certificate_errors_total 0
# HELP certwatcher_read_certificate_total Total number of certificate reads
# TYPE certwatcher_read_certificate_total counter
certwatcher_read_certificate_total 0
# HELP controller_runtime_active_workers Number of currently used workers per controller
# TYPE controller_runtime_active_workers gauge
controller_runtime_active_workers{controller="foogdcf"} 0
# HELP controller_runtime_max_concurrent_reconciles Maximum number of concurrent reconciles per controller
# TYPE controller_runtime_max_concurrent_reconciles gauge
controller_runtime_max_concurrent_reconciles{controller="foogdcf"} 1
# HELP controller_runtime_reconcile_errors_total Total number of reconciliation errors per controller
# TYPE controller_runtime_reconcile_errors_total counter
controller_runtime_reconcile_errors_total{controller="foogdcf"} 0
# HELP controller_runtime_reconcile_time_seconds Length of time per reconciliation per controller
# TYPE controller_runtime_reconcile_time_seconds histogram
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.005"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.01"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.025"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.05"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.1"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.15"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.2"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.25"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.3"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.35"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.4"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.45"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.5"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.6"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.7"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.8"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.9"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="1"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="1.25"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="1.5"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="1.75"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="2"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="2.5"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="3"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="3.5"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="4"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="4.5"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="5"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="6"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="7"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="8"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="9"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="10"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="15"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="20"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="25"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="30"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="40"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="50"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="60"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="+Inf"} 6
controller_runtime_reconcile_time_seconds_sum{controller="foogdcf"} 0.009288
controller_runtime_reconcile_time_seconds_count{controller="foogdcf"} 6
# HELP controller_runtime_reconcile_total Total number of reconciliations per controller
# TYPE controller_runtime_reconcile_total counter
controller_runtime_reconcile_total{controller="foogdcf",result="error"} 0
controller_runtime_reconcile_total{controller="foogdcf",result="requeue"} 1
controller_runtime_reconcile_total{controller="foogdcf",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="foogdcf",result="success"} 5
# HELP go_gc_cycles_automatic_gc_cycles_total Count of completed GC cycles generated by the Go runtime.
# TYPE go_gc_cycles_automatic_gc_cycles_total counter
go_gc_cycles_automatic_gc_cycles_total 5
# HELP go_gc_cycles_forced_gc_cycles_total Count of completed GC cycles forced by the application.
# TYPE go_gc_cycles_forced_gc_cycles_total counter
go_gc_cycles_forced_gc_cycles_total 0
# HELP go_gc_cycles_total_gc_cycles_total Count of all completed GC cycles.
# TYPE go_gc_cycles_total_gc_cycles_total counter
go_gc_cycles_total_gc_cycles_total 5
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.0001536
go_gc_duration_seconds{quantile="0.25"} 0.0002285
go_gc_duration_seconds{quantile="0.5"} 0.0003148
go_gc_duration_seconds{quantile="0.75"} 0.0003314
go_gc_duration_seconds{quantile="1"} 0.0004237
go_gc_duration_seconds_sum 0.001452
go_gc_duration_seconds_count 5
# HELP go_gc_heap_allocs_by_size_bytes_total Distribution of heap allocations by approximate size. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks.
# TYPE go_gc_heap_allocs_by_size_bytes_total histogram
go_gc_heap_allocs_by_size_bytes_total_bucket{le="8.999999999999998"} 6491
go_gc_heap_allocs_by_size_bytes_total_bucket{le="24.999999999999996"} 26517
go_gc_heap_allocs_by_size_bytes_total_bucket{le="64.99999999999999"} 43784
go_gc_heap_allocs_by_size_bytes_total_bucket{le="144.99999999999997"} 53714
go_gc_heap_allocs_by_size_bytes_total_bucket{le="320.99999999999994"} 56966
go_gc_heap_allocs_by_size_bytes_total_bucket{le="704.9999999999999"} 58770
go_gc_heap_allocs_by_size_bytes_total_bucket{le="1536.9999999999998"} 59742
go_gc_heap_allocs_by_size_bytes_total_bucket{le="3200.9999999999995"} 60044
go_gc_heap_allocs_by_size_bytes_total_bucket{le="6528.999999999999"} 60168
go_gc_heap_allocs_by_size_bytes_total_bucket{le="13568.999999999998"} 60228
go_gc_heap_allocs_by_size_bytes_total_bucket{le="27264.999999999996"} 60259
go_gc_heap_allocs_by_size_bytes_total_bucket{le="+Inf"} 60282
go_gc_heap_allocs_by_size_bytes_total_sum 9.208104e+06
go_gc_heap_allocs_by_size_bytes_total_count 60282
# HELP go_gc_heap_allocs_bytes_total Cumulative sum of memory allocated to the heap by the application.
# TYPE go_gc_heap_allocs_bytes_total counter
go_gc_heap_allocs_bytes_total 9.208104e+06
# HELP go_gc_heap_allocs_objects_total Cumulative count of heap allocations triggered by the application. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks.
# TYPE go_gc_heap_allocs_objects_total counter
go_gc_heap_allocs_objects_total 60282
# HELP go_gc_heap_frees_by_size_bytes_total Distribution of freed heap allocations by approximate size. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks.
# TYPE go_gc_heap_frees_by_size_bytes_total histogram
go_gc_heap_frees_by_size_bytes_total_bucket{le="8.999999999999998"} 2671
go_gc_heap_frees_by_size_bytes_total_bucket{le="24.999999999999996"} 14812
go_gc_heap_frees_by_size_bytes_total_bucket{le="64.99999999999999"} 24875
go_gc_heap_frees_by_size_bytes_total_bucket{le="144.99999999999997"} 32316
go_gc_heap_frees_by_size_bytes_total_bucket{le="320.99999999999994"} 33974
go_gc_heap_frees_by_size_bytes_total_bucket{le="704.9999999999999"} 35202
go_gc_heap_frees_by_size_bytes_total_bucket{le="1536.9999999999998"} 35794
go_gc_heap_frees_by_size_bytes_total_bucket{le="3200.9999999999995"} 35901
go_gc_heap_frees_by_size_bytes_total_bucket{le="6528.999999999999"} 35969
go_gc_heap_frees_by_size_bytes_total_bucket{le="13568.999999999998"} 35999
go_gc_heap_frees_by_size_bytes_total_bucket{le="27264.999999999996"} 36012
go_gc_heap_frees_by_size_bytes_total_bucket{le="+Inf"} 36019
go_gc_heap_frees_by_size_bytes_total_sum 4.470992e+06
go_gc_heap_frees_by_size_bytes_total_count 36019
# HELP go_gc_heap_frees_bytes_total Cumulative sum of heap memory freed by the garbage collector.
# TYPE go_gc_heap_frees_bytes_total counter
go_gc_heap_frees_bytes_total 4.470992e+06
# HELP go_gc_heap_frees_objects_total Cumulative count of heap allocations whose storage was freed by the garbage collector. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks.
# TYPE go_gc_heap_frees_objects_total counter
go_gc_heap_frees_objects_total 36019
# HELP go_gc_heap_goal_bytes Heap size target for the end of the GC cycle.
# TYPE go_gc_heap_goal_bytes gauge
go_gc_heap_goal_bytes 8.74592e+06
# HELP go_gc_heap_objects_objects Number of objects, live or unswept, occupying heap memory.
# TYPE go_gc_heap_objects_objects gauge
go_gc_heap_objects_objects 24263
# HELP go_gc_heap_tiny_allocs_objects_total Count of small allocations that are packed together into blocks. These allocations are counted separately from other allocations because each individual allocation is not tracked by the runtime, only their block. Each block is already accounted for in allocs-by-size and frees-by-size.
# TYPE go_gc_heap_tiny_allocs_objects_total counter
go_gc_heap_tiny_allocs_objects_total 3855
# HELP go_gc_pauses_seconds_total Distribution individual GC-related stop-the-world pause latencies.
# TYPE go_gc_pauses_seconds_total histogram
go_gc_pauses_seconds_total_bucket{le="-5e-324"} 0
go_gc_pauses_seconds_total_bucket{le="9.999999999999999e-10"} 0
go_gc_pauses_seconds_total_bucket{le="9.999999999999999e-09"} 0
go_gc_pauses_seconds_total_bucket{le="9.999999999999998e-08"} 0
go_gc_pauses_seconds_total_bucket{le="1.0239999999999999e-06"} 0
go_gc_pauses_seconds_total_bucket{le="1.0239999999999999e-05"} 0
go_gc_pauses_seconds_total_bucket{le="0.00010239999999999998"} 5
go_gc_pauses_seconds_total_bucket{le="0.0010485759999999998"} 10
go_gc_pauses_seconds_total_bucket{le="0.010485759999999998"} 10
go_gc_pauses_seconds_total_bucket{le="0.10485759999999998"} 10
go_gc_pauses_seconds_total_bucket{le="+Inf"} 10
go_gc_pauses_seconds_total_sum NaN
go_gc_pauses_seconds_total_count 10
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 50
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.18.3"} 1
# HELP go_memory_classes_heap_free_bytes Memory that is completely free and eligible to be returned to the underlying system, but has not been. This metric is the runtime's estimate of free address space that is backed by physical memory.
# TYPE go_memory_classes_heap_free_bytes gauge
go_memory_classes_heap_free_bytes 999424
# HELP go_memory_classes_heap_objects_bytes Memory occupied by live objects and dead objects that have not yet been marked free by the garbage collector.
# TYPE go_memory_classes_heap_objects_bytes gauge
go_memory_classes_heap_objects_bytes 4.737112e+06
# HELP go_memory_classes_heap_released_bytes Memory that is completely free and has been returned to the underlying system. This metric is the runtime's estimate of free address space that is still mapped into the process, but is not backed by physical memory.
# TYPE go_memory_classes_heap_released_bytes gauge
go_memory_classes_heap_released_bytes 4.071424e+06
# HELP go_memory_classes_heap_stacks_bytes Memory allocated from the heap that is reserved for stack space, whether or not it is currently in-use.
# TYPE go_memory_classes_heap_stacks_bytes gauge
go_memory_classes_heap_stacks_bytes 917504
# HELP go_memory_classes_heap_unused_bytes Memory that is reserved for heap objects but is not currently used to hold heap objects.
# TYPE go_memory_classes_heap_unused_bytes gauge
go_memory_classes_heap_unused_bytes 1.857448e+06
# HELP go_memory_classes_metadata_mcache_free_bytes Memory that is reserved for runtime mcache structures, but not in-use.
# TYPE go_memory_classes_metadata_mcache_free_bytes gauge
go_memory_classes_metadata_mcache_free_bytes 6000
# HELP go_memory_classes_metadata_mcache_inuse_bytes Memory that is occupied by runtime mcache structures that are currently being used.
# TYPE go_memory_classes_metadata_mcache_inuse_bytes gauge
go_memory_classes_metadata_mcache_inuse_bytes 9600
# HELP go_memory_classes_metadata_mspan_free_bytes Memory that is reserved for runtime mspan structures, but not in-use.
# TYPE go_memory_classes_metadata_mspan_free_bytes gauge
go_memory_classes_metadata_mspan_free_bytes 14824
# HELP go_memory_classes_metadata_mspan_inuse_bytes Memory that is occupied by runtime mspan structures that are currently being used.
# TYPE go_memory_classes_metadata_mspan_inuse_bytes gauge
go_memory_classes_metadata_mspan_inuse_bytes 115736
# HELP go_memory_classes_metadata_other_bytes Memory that is reserved for or used to hold runtime metadata.
# TYPE go_memory_classes_metadata_other_bytes gauge
go_memory_classes_metadata_other_bytes 5.209216e+06
# HELP go_memory_classes_os_stacks_bytes Stack memory allocated by the underlying operating system.
# TYPE go_memory_classes_os_stacks_bytes gauge
go_memory_classes_os_stacks_bytes 0
# HELP go_memory_classes_other_bytes Memory used by execution trace buffers, structures for debugging the runtime, finalizer and profiler specials, and more.
# TYPE go_memory_classes_other_bytes gauge
go_memory_classes_other_bytes 1.67076e+06
# HELP go_memory_classes_profiling_buckets_bytes Memory that is used by the stack trace hash map used for profiling.
# TYPE go_memory_classes_profiling_buckets_bytes gauge
go_memory_classes_profiling_buckets_bytes 5432
# HELP go_memory_classes_total_bytes All memory mapped by the Go runtime into the current process as read-write. Note that this does not include memory mapped by code called via cgo or via the syscall package. Sum of all metrics in /memory/classes.
# TYPE go_memory_classes_total_bytes gauge
go_memory_classes_total_bytes 1.961448e+07
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 4.737112e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 9.208104e+06
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 5432
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 39874
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
# TYPE go_memstats_gc_cpu_fraction gauge
go_memstats_gc_cpu_fraction 0
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 5.209216e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 4.737112e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 5.070848e+06
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 6.59456e+06
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 24263
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 4.071424e+06
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 1.1665408e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.6561460180810273e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 64137
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 9600
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 15600
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 115736
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 130560
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 8.74592e+06
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 1.67076e+06
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 917504
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 917504
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 1.961448e+07
# HELP go_sched_goroutines_goroutines Count of live goroutines.
# TYPE go_sched_goroutines_goroutines gauge
go_sched_goroutines_goroutines 50
# HELP go_sched_latencies_seconds Distribution of the time goroutines have spent in the scheduler in a runnable state before actually running.
# TYPE go_sched_latencies_seconds histogram
go_sched_latencies_seconds_bucket{le="-5e-324"} 0
go_sched_latencies_seconds_bucket{le="9.999999999999999e-10"} 100
go_sched_latencies_seconds_bucket{le="9.999999999999999e-09"} 100
go_sched_latencies_seconds_bucket{le="9.999999999999998e-08"} 100
go_sched_latencies_seconds_bucket{le="1.0239999999999999e-06"} 100
go_sched_latencies_seconds_bucket{le="1.0239999999999999e-05"} 242
go_sched_latencies_seconds_bucket{le="0.00010239999999999998"} 342
go_sched_latencies_seconds_bucket{le="0.0010485759999999998"} 376
go_sched_latencies_seconds_bucket{le="0.010485759999999998"} 383
go_sched_latencies_seconds_bucket{le="0.10485759999999998"} 384
go_sched_latencies_seconds_bucket{le="+Inf"} 384
go_sched_latencies_seconds_sum NaN
go_sched_latencies_seconds_count 384
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 12
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.21
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 11
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 3.1571968e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.65614599236e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 7.5425792e+08
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19
# HELP rest_client_requests_total Number of HTTP requests, partitioned by status code, method, and host.
# TYPE rest_client_requests_total counter
rest_client_requests_total{code="200",host="10.96.0.1:443",method="GET"} 52
rest_client_requests_total{code="200",host="10.96.0.1:443",method="PUT"} 13
rest_client_requests_total{code="201",host="10.96.0.1:443",method="POST"} 3
rest_client_requests_total{code="404",host="10.96.0.1:443",method="GET"} 1
# HELP workqueue_adds_total Total number of adds handled by workqueue
# TYPE workqueue_adds_total counter
workqueue_adds_total{name="foogdcf"} 6
# HELP workqueue_depth Current depth of workqueue
# TYPE workqueue_depth gauge
workqueue_depth{name="foogdcf"} 0
# HELP workqueue_longest_running_processor_seconds How many seconds has the longest running processor for workqueue been running.
# TYPE workqueue_longest_running_processor_seconds gauge
workqueue_longest_running_processor_seconds{name="foogdcf"} 0
# HELP workqueue_queue_duration_seconds How long in seconds an item stays in workqueue before being requested
# TYPE workqueue_queue_duration_seconds histogram
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="1e-08"} 0
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="1e-07"} 0
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="1e-06"} 0
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="9.999999999999999e-06"} 0
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="9.999999999999999e-05"} 4
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="0.001"} 5
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="0.01"} 6
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="0.1"} 6
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="1"} 6
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="10"} 6
workqueue_queue_duration_seconds_bucket{name="foogdcf",le="+Inf"} 6
workqueue_queue_duration_seconds_sum{name="foogdcf"} 0.0016871
workqueue_queue_duration_seconds_count{name="foogdcf"} 6
# HELP workqueue_retries_total Total number of retries handled by workqueue
# TYPE workqueue_retries_total counter
workqueue_retries_total{name="foogdcf"} 1
# HELP workqueue_unfinished_work_seconds How many seconds of work has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
# TYPE workqueue_unfinished_work_seconds gauge
workqueue_unfinished_work_seconds{name="foogdcf"} 0
# HELP workqueue_work_duration_seconds How long in seconds processing an item from workqueue takes.
# TYPE workqueue_work_duration_seconds histogram
workqueue_work_duration_seconds_bucket{name="foogdcf",le="1e-08"} 0
workqueue_work_duration_seconds_bucket{name="foogdcf",le="1e-07"} 0
workqueue_work_duration_seconds_bucket{name="foogdcf",le="1e-06"} 0
workqueue_work_duration_seconds_bucket{name="foogdcf",le="9.999999999999999e-06"} 0
workqueue_work_duration_seconds_bucket{name="foogdcf",le="9.999999999999999e-05"} 1
workqueue_work_duration_seconds_bucket{name="foogdcf",le="0.001"} 4
workqueue_work_duration_seconds_bucket{name="foogdcf",le="0.01"} 6
workqueue_work_duration_seconds_bucket{name="foogdcf",le="0.1"} 6
workqueue_work_duration_seconds_bucket{name="foogdcf",le="1"} 6
workqueue_work_duration_seconds_bucket{name="foogdcf",le="10"} 6
workqueue_work_duration_seconds_bucket{name="foogdcf",le="+Inf"} 6
workqueue_work_duration_seconds_sum{name="foogdcf"} 0.009579800000000001
workqueue_work_duration_seconds_count{name="foogdcf"} 6

@varshaprasad96 @Kavinjsir
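To show how dashboard panels could be derived from output like the above, here is a minimal sketch that parses a few of those lines and computes the mean reconcile time, the success ratio and the share of reconciles that finished within 5 ms. The parser is deliberately simplified (no timestamps, no escaped quotes in label values, a single controller), and `parse` and `summarize` are hypothetical helper names. It works on one scrape, whereas Prometheus and Grafana would normally apply `rate()` over a time window.

```python
import re

# A few sample lines copied from the /metrics output above.
SAMPLE = """\
# HELP controller_runtime_reconcile_time_seconds Length of time per reconciliation per controller
# TYPE controller_runtime_reconcile_time_seconds histogram
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.005"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="0.01"} 6
controller_runtime_reconcile_time_seconds_bucket{controller="foogdcf",le="+Inf"} 6
controller_runtime_reconcile_time_seconds_sum{controller="foogdcf"} 0.009288
controller_runtime_reconcile_time_seconds_count{controller="foogdcf"} 6
controller_runtime_reconcile_total{controller="foogdcf",result="error"} 0
controller_runtime_reconcile_total{controller="foogdcf",result="requeue"} 1
controller_runtime_reconcile_total{controller="foogdcf",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="foogdcf",result="success"} 5
"""

# name, optional {labels}, value. Simplified: no trailing timestamp.
LINE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$")
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Yield (name, labels, value) per sample line; skip comments and blanks."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match:
            name, raw_labels, value = match.groups()
            yield name, dict(LABEL.findall(raw_labels or "")), float(value)


def summarize(text):
    """Derive a few dashboard-style numbers from one scrape (one controller)."""
    prefix = "controller_runtime_reconcile_time_seconds"
    total_seconds = count = 0.0
    by_result, buckets = {}, {}
    for name, labels, value in parse(text):
        if name == prefix + "_sum":
            total_seconds += value
        elif name == prefix + "_count":
            count += value
        elif name == prefix + "_bucket":
            buckets[labels["le"]] = value  # cumulative count up to `le` seconds
        elif name == "controller_runtime_reconcile_total":
            by_result[labels["result"]] = value
    reconciles = sum(by_result.values())
    return {
        "mean_seconds": total_seconds / count if count else None,
        "success_ratio": by_result.get("success", 0.0) / reconciles if reconciles else None,
        "share_within_5ms": buckets.get("0.005", 0.0) / count if count else None,
        "by_result": by_result,
    }
```

For the sample lines, `summarize(SAMPLE)` gives a mean reconcile time of about 1.5 ms (0.009288 s over 6 reconciles) and a success ratio of 5/6.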

camilamacedo86 commented 2 years ago

@Kavinjsir we can close this one since the Grafana plugin has been merged and is available. Also, let's add the info here for those who want to check how it works:

a) (Basic plugin functions) https://www.youtube.com/watch?v=-w_JjcV8jXc&t=224s
b) (Feature to generate the graphs with custom metrics) https://www.youtube.com/watch?v=x_0FHta2HXc

Closing this one as done. If someone would like to propose changes or finds a bug, please raise a new issue.