centreon / centreon-plugins

Collection of standard plugins to discover and gather cloud-to-edge metrics and status across your whole IT infrastructure.
https://www.centreon.com
Apache License 2.0

[cloud::google::gcp::management::stackdriver] how to monitor kubernetes on GCP? #1601

Closed joschi99 closed 3 years ago

joschi99 commented 5 years ago

How can we monitor Kubernetes metrics with Stackdriver on GCP? For example, we would like to monitor memory usage for Kubernetes nodes (screenshot). The problem is that the plugin requires an instance as mandatory input, but these Kubernetes metrics have no instance; they belong to a Kubernetes Engine resource instead (screenshot). Could someone help?

joschi99 commented 5 years ago

I can successfully query instance metrics. For example:

./centreon_plugins.pl --plugin=cloud::google::gcp::management::stackdriver::plugin --mode=get-metrics --custommode=api --api='compute.googleapis.com' --key-file=/root/gcp-ive01-dev.json --instance='gke-gke-app200-dev-default-pool-2a84f477-ggw8' --metric='instance/cpu/utilization' --aggregation=average --timeframe=600
OK: Metric 'instance/cpu/utilization' of resource 'gke-gke-app200-dev-default-pool-2a84f477-ggw8' value is 0.112437522470464 | 'instance/cpu/utilization_average'=0.112437522470464;;;;

But querying Kubernetes metrics does not work:

./centreon_plugins.pl --plugin=cloud::google::gcp::management::stackdriver::plugin --mode=get-metrics --custommode=api --api='kubernetes.io' --key-file=/root/gcp-ive01-dev.json --instance='gke-gke-app200-dev-default-pool-2a84f477-ggw8' --metric='node/cpu/core_usage_time' --aggregation=average --timeframe=600
UNKNOWN: Monitoring endpoint API return error code '404' (add --debug option for detailed message)
URL: 'https://monitoring.googleapis.com/v3/projects/gcp-ive01-dev/timeSeries/?filter=metric.type%20%3D%20%22kubernetes.io%2Fnode%2Fcpu%2Fcore_usage_time%22%20AND%20metric.labels.instance_name%20%3D%20starts_with%28gke-gke-app200-dev-default-pool-2a84f477-ggw8%29&interval.startTime=2019-07-31T06%3A54%3A36.000000Z&interval.endTime=2019-07-31T07%3A04%3A36.000000Z'
======> request send
GET https://monitoring.googleapis.com/v3/projects/gcp-ive01-dev/timeSeries/?filter=metric.type%20%3D%20%22kubernetes.io%2Fnode%2Fcpu%2Fcore_usage_time%22%20AND%20metric.labels.instance_name%20%3D%20starts_with%28gke-gke-app200-dev-default-pool-2a84f477-ggw8%29&interval.startTime=2019-07-31T06%3A54%3A36.000000Z&interval.endTime=2019-07-31T07%3A04%3A36.000000Z
Accept: application/json
Authorization: Bearer ya29.c.ElpWBxtNBZlHJGNAwOoJuHV-p9noICHZeLTOcM4_tnuIZo4Y3ZZFyi8_1mWgbTXp3oQBqTMHqp7SmABMsSmVxyS8w73f5ThJE5P1p-3QjiSxnesNCDcuEZui0ZE
User-Agent: centreon::plugins::backend::http::useragent

======> response done
HTTP/1.1 404 Not Found
Cache-Control: private
Date: Wed, 31 Jul 2019 07:04:36 GMT
Accept-Ranges: none
Server: ESF
Vary: X-Origin
Vary: Referer
Vary: Origin,Accept-Encoding
Content-Type: application/json; charset=UTF-8
Alt-Svc: quic=":443"; ma=2592000; v="46,43,39"
Client-Date: Wed, 31 Jul 2019 07:04:36 GMT
Client-Peer: 74.125.193.95:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /C=US/O=Google Trust Services/CN=Google Internet Authority G3
Client-SSL-Cert-Subject: /C=US/ST=California/L=Mountain View/O=Google LLC/CN=edgecert.googleapis.com
Client-SSL-Cipher: ECDHE-ECDSA-AES128-GCM-SHA256
Client-SSL-Socket-Class: IO::Socket::SSL
Client-Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 0

{
  "error": {
    "code": 404,
    "message": "The metric referenced by the provided filter is unknown. Check the metric name and labels.",
    "status": "NOT_FOUND"
  }
}
Error message : The metric referenced by the provided filter is unknown. Check the metric name and labels.

joschi99 commented 5 years ago

Does anyone have an idea how I can check the Kubernetes metrics?

cgagnaire commented 5 years ago

Hi @joschi99, the documentation says that Kubernetes metrics must be prefixed with "kubernetes.io/" (https://cloud.google.com/monitoring/api/metrics_kubernetes). Can you try this? Thanks.
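To check what the plugin actually sent, the percent-encoded request URL from the debug output earlier in this thread can be decoded. The following is a minimal sketch using only the Python standard library; the helper name `decode_query` is illustrative and not part of the plugin.

```python
from urllib.parse import urlsplit, parse_qs

# Request URL copied from the debug output earlier in this thread.
REQUEST_URL = (
    "https://monitoring.googleapis.com/v3/projects/gcp-ive01-dev/timeSeries/"
    "?filter=metric.type%20%3D%20%22kubernetes.io%2Fnode%2Fcpu%2Fcore_usage_time%22"
    "%20AND%20metric.labels.instance_name%20%3D%20starts_with%28"
    "gke-gke-app200-dev-default-pool-2a84f477-ggw8%29"
    "&interval.startTime=2019-07-31T06%3A54%3A36.000000Z"
    "&interval.endTime=2019-07-31T07%3A04%3A36.000000Z"
)


def decode_query(url: str) -> dict:
    """Return the percent-decoded query parameters of a URL (hypothetical helper)."""
    params = parse_qs(urlsplit(url).query)
    # Each parameter appears once in this URL, so keep only the first value.
    return {name: values[0] for name, values in params.items()}


decoded = decode_query(REQUEST_URL)
for name, value in decoded.items():
    print(f"{name}: {value}")
```

The decoded `filter` parameter shows that `metric.type` already carries the "kubernetes.io/" prefix in that request, and that the resource is matched on `metric.labels.instance_name`.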

joschi99 commented 5 years ago

Hi @cgagnaire, I tried different options, but none of them seems to work:

./centreon_plugins.pl --plugin=cloud::google::gcp::management::stackdriver::plugin --mode=get-metrics --custommode=api --api='kubernetes.io' --key-file=/root/gcp-ive01-dev.json --instance='gke-gke-app200-dev-default-pool-2a84f477-ggw8' --metric='instance/cpu/utilization' --aggregation=average --timeframe=600 --debug
UNKNOWN: Monitoring endpoint API return error code '404' (add --debug option for detailed message)
./centreon_plugins.pl --plugin=cloud::google::gcp::management::stackdriver::plugin --mode=get-metrics --custommode=api --api='compute.googleapis.com' --key-file=/root/gcp-ive01-dev.json --instance='gke-gke-app200-dev-default-pool-2a84f477-ggw8' --metric='kubernetes.io/instance/cpu/utilization' --aggregation=average --timeframe=600 --debug
UNKNOWN: Monitoring endpoint API return error code '404' (add --debug option for detailed message)
./centreon_plugins.pl --plugin=cloud::google::gcp::management::stackdriver::plugin --mode=get-metrics --custommode=api --api='container.googleapis.com' --key-file=/root/gcp-ive01-dev.json --instance='gke-gke-app200-dev-default-pool-2a84f477-ggw8' --metric='kubernetes.io/instance/cpu/utilization' --aggregation=average --timeframe=600
UNKNOWN: Monitoring endpoint API return error code '404' (add --debug option for detailed message)
./centreon_plugins.pl --plugin=cloud::google::gcp::management::stackdriver::plugin --mode=get-metrics --custommode=api --api='container.googleapis.com' --key-file=/root/gcp-ive01-dev.json --instance='gke-gke-app200-dev-default-pool-2a84f477-ggw8' --metric='instance/cpu/utilization' --aggregation=average --timeframe=600
UNKNOWN: Monitoring endpoint API return error code '404' (add --debug option for detailed message)

joschi99 commented 5 years ago

Hi @garnier-quentin, do you have any idea whether monitoring Kubernetes through Stackdriver is only a matter of plugin configuration, or whether it requires some development?

cgagnaire commented 5 years ago

Hi @joschi99, it's really difficult for us to know without hands-on access to a platform to understand what's what. So, if you're willing to give us access to yours, either by providing credentials for a dev platform or through a remote session, we are totally open to it!

joschi99 commented 5 years ago

Hi @cgagnaire, no problem, we can do a remote session. Would next Monday at 2pm work for you?

joschi99 commented 5 years ago

We could also do a remote session between August 28 and 30.

cgagnaire commented 5 years ago

Hi @joschi99, sorry, but I don't have much availability for the next 3 weeks. I'll check with @Sims24 or @garnier-quentin whether they can connect, but surely not today, sorry. I'll get back to you.

joschi99 commented 5 years ago

Thank you very much.

Sims24 commented 5 years ago

@joschi99 thanks for your help today. We need to add a dimension option to the very generic mode to be able to query every metric from Stackdriver.

sims(18:07:00):# ./centreon_plugins.pl --plugin=cloud::google::gcp::management::stackdriver::plugin --mode=get-metrics --custommode=api --api='kubernetes.io' --key-file=../key.json --instance='gke-node-name' --metric='node/cpu/allocatable_cores' --dimension='resource.labels.node_name' --proxyurl='connect://proxy.int.centreon.com:3128'
OK: Metric 'node/cpu/allocatable_cores' of resource 'gke-node-name' value is 0.94 | 'node/cpu/allocatable_cores_average'=0.94;;;;

Nevertheless, we will have to write some dedicated modes for k8s, Redis, etc., because using only the generic one has limitations (e.g. using resource.labels.cluster_name as a dimension will not provide the allocatable CPU metric for each node in the cluster).

I'll push my work to a dedicated branch so you will be able to play with it quickly and provide feedback.
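For illustration, the effect of a configurable dimension on the Monitoring API filter can be sketched as follows. This is a minimal Python sketch, not the plugin's Perl code: the function names `build_filter` and `encode_filter` are hypothetical, and the filter shape (including `starts_with`) is copied from the debug output earlier in this thread, on the assumption that only the label name changes.

```python
from urllib.parse import quote


def build_filter(api: str, metric: str, dimension: str, value: str) -> str:
    """Build a filter in the shape seen in the debug output (hypothetical helper)."""
    return f'metric.type = "{api}/{metric}" AND {dimension} = starts_with({value})'


def encode_filter(filter_text: str) -> str:
    """Percent-encode the filter for use as a URL query parameter value."""
    return quote(filter_text, safe="")


# Filter from the failing request: the label is fixed to the instance name.
old_filter = build_filter(
    "kubernetes.io",
    "node/cpu/core_usage_time",
    "metric.labels.instance_name",
    "gke-gke-app200-dev-default-pool-2a84f477-ggw8",
)

# Filter with the dimension from the working command: the node name label.
new_filter = build_filter(
    "kubernetes.io",
    "node/cpu/allocatable_cores",
    "resource.labels.node_name",
    "gke-node-name",
)

print(old_filter)
print(new_filter)
print(encode_filter(new_filter))
```

With the same inputs as the failing command, `encode_filter(old_filter)` reproduces the `filter` value from the debug URL, which suggests the sketch matches the request shape. Whether the plugin keeps `starts_with` once the dimension is configurable is an assumption here.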

joschi99 commented 5 years ago

Hi @Sims24, thank you very much for your help. Let me know when I can test it.

Sims24 commented 5 years ago

Hi @joschi99,

It's OK now with https://github.com/centreon/centreon-plugins/pull/1633 and the dimension option for simple k8s metrics.

This week I'll work on dedicated modes to monitor the most basic metrics for Redis and each k8s component. I hope to deliver them before the end of the week.

Thanks

joschi99 commented 5 years ago

Hi @Sims24, great. Let me know when we can do some tests.

Thanks

garnier-quentin commented 3 years ago

Hi @joschi99,

You can test with the following branch: https://github.com/centreon/centreon-plugins/tree/gcp Try the following command:

./centreon_plugins.pl --plugin=cloud::google::gcp::management::stackdriver::plugin --mode=get-metrics --custommode=api --api='kubernetes.io' --key-file=../key.json --dimension-name='resource.labels.node_name' --dimension-value='gke-node-name' --metric='node/cpu/allocatable_cores'

garnier-quentin commented 3 years ago

I'm closing this issue. You can now monitor it with the latest version.