cloudworkz / kube-eagle

A prometheus exporter created to provide a better overview of your resource allocation and utilization in a Kubernetes cluster.
MIT License

kube-eagle on aws #12

Closed soufianez75 closed 5 years ago

soufianez75 commented 5 years ago

Can you tell me if kube-eagle is usable with AWS EKS? Regards

weeco commented 5 years ago

I haven't tested it on AWS, but I don't see why it wouldn't work on EKS too.

soufianez75 commented 5 years ago

Because after deploying the kube-eagle pod and importing the dashboard, the CPU values are wrong and the individual pods don't appear on the dashboard; I only get a CPU series named "A".

weeco commented 5 years ago

Please provide more information. Are you talking about CPU usage, CPU limit, or CPU request? Is it per node or per pod? What labels are wrong in Prometheus?

What version of Kube Eagle are you running?

soufianez75 commented 5 years ago

This is what I have in my Prometheus configuration:

My dashboard:

[screenshot: capture d'écran 2019-03-08 à 19 08 39]

weeco commented 5 years ago

Can you please check the response of the /metrics endpoint, along with the logs of Kube Eagle?
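For reference, one quick way to check both is to port-forward the exporter and curl it directly. This sketch assumes the deployment is named kube-eagle, lives in the monitoring namespace, and listens on the default port 8080; adjust the names and port to your setup:

# in one terminal (deployment name, namespace and port are assumptions):
kubectl -n monitoring port-forward deployment/kube-eagle 8080:8080

# in a second terminal:
curl -s http://localhost:8080/metrics | head -n 50
kubectl -n monitoring logs deployment/kube-eagle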

weeco commented 5 years ago

Since I haven't heard back from you, I'll close the issue. Just comment if the issue still exists and you'd like to figure it out.

soufianez75 commented 5 years ago

Here is part of my /metrics output:

# HELP eagle_node_resource_allocatable_cpu_cores Allocatable CPU cores on a specific node in Kubernetes
# TYPE eagle_node_resource_allocatable_cpu_cores gauge
eagle_node_resource_allocatable_cpu_cores{node="ip-10-18-175-9.eu-west-1.compute.internal"} 2.0
eagle_node_resource_allocatable_cpu_cores{node="ip-10-18-255-21.eu-west-1.compute.internal"} 2.0
eagle_node_resource_allocatable_cpu_cores{node="ip-10-18-77-142.eu-west-1.compute.internal"} 2.0

# HELP eagle_node_resource_allocatable_memory_bytes Allocatable memory bytes on a specific node in Kubernetes
# TYPE eagle_node_resource_allocatable_memory_bytes gauge
eagle_node_resource_allocatable_memory_bytes{node="ip-10-18-175-9.eu-west-1.compute.internal"} 3.971022848e+09
eagle_node_resource_allocatable_memory_bytes{node="ip-10-18-255-21.eu-west-1.compute.internal"} 3.971022848e+09
eagle_node_resource_allocatable_memory_bytes{node="ip-10-18-77-142.eu-west-1.compute.internal"} 3.971022848e+09

# HELP eagle_node_resource_limits_cpu_cores Total limit CPU cores of all specified pod resources on a node
# TYPE eagle_node_resource_limits_cpu_cores gauge
eagle_node_resource_limits_cpu_cores{node="ip-10-18-175-9.eu-west-1.compute.internal"} 0.0
eagle_node_resource_limits_cpu_cores{node="ip-10-18-255-21.eu-west-1.compute.internal"} 0.0
eagle_node_resource_limits_cpu_cores{node="ip-10-18-77-142.eu-west-1.compute.internal"} 0.0

# HELP eagle_node_resource_limits_memory_bytes Total limit of RAM bytes of all specified pod resources on a node
# TYPE eagle_node_resource_limits_memory_bytes gauge
eagle_node_resource_limits_memory_bytes{node="ip-10-18-175-9.eu-west-1.compute.internal"} 3.5651584e+08
eagle_node_resource_limits_memory_bytes{node="ip-10-18-255-21.eu-west-1.compute.internal"} 0.0
eagle_node_resource_limits_memory_bytes{node="ip-10-18-77-142.eu-west-1.compute.internal"} 0.0

# HELP eagle_node_resource_requests_cpu_cores Total request of CPU cores of all specified pod resources on a node
# TYPE eagle_node_resource_requests_cpu_cores gauge
eagle_node_resource_requests_cpu_cores{node="ip-10-18-175-9.eu-west-1.compute.internal"} 0.31000000000000005
eagle_node_resource_requests_cpu_cores{node="ip-10-18-255-21.eu-west-1.compute.internal"} 0.11
eagle_node_resource_requests_cpu_cores{node="ip-10-18-77-142.eu-west-1.compute.internal"} 0.11

# HELP eagle_node_resource_requests_memory_bytes Total request of RAM bytes of all specified pod resources on a node
# TYPE eagle_node_resource_requests_memory_bytes gauge
eagle_node_resource_requests_memory_bytes{node="ip-10-18-175-9.eu-west-1.compute.internal"} 1.4680064e+08
eagle_node_resource_requests_memory_bytes{node="ip-10-18-255-21.eu-west-1.compute.internal"} 0.0
eagle_node_resource_requests_memory_bytes{node="ip-10-18-77-142.eu-west-1.compute.internal"} 0.0

# HELP eagle_node_resource_usage_cpu_cores Total number of used CPU cores on a node
# TYPE eagle_node_resource_usage_cpu_cores gauge
eagle_node_resource_usage_cpu_cores{node="ip-10-18-175-9.eu-west-1.compute.internal"} 0.056
eagle_node_resource_usage_cpu_cores{node="ip-10-18-255-21.eu-west-1.compute.internal"} 0.07
eagle_node_resource_usage_cpu_cores{node="ip-10-18-77-142.eu-west-1.compute.internal"} 0.091

# HELP eagle_node_resource_usage_memory_bytes Total number of RAM bytes used on a node
# TYPE eagle_node_resource_usage_memory_bytes gauge
eagle_node_resource_usage_memory_bytes{node="ip-10-18-175-9.eu-west-1.compute.internal"} 1.071296512e+09
eagle_node_resource_usage_memory_bytes{node="ip-10-18-255-21.eu-west-1.compute.internal"} 1.108758528e+09
eagle_node_resource_usage_memory_bytes{node="ip-10-18-77-142.eu-west-1.compute.internal"} 1.265700864e+09

# HELP eagle_pod_container_resource_limits_cpu_cores The container's CPU limit in Kubernetes
# TYPE eagle_pod_container_resource_limits_cpu_cores gauge
eagle_pod_container_resource_limits_cpu_cores{container="apigateway-neobank-service-nginx",namespace="apigateway-neobank-service",node="ip-10-18-255-21.eu-west-1.compute.internal",phase="Running",pod="apigateway-neobank-service-nginx-deployment-67c5b76658-rrbfc",qos="BestEffort"} 0.0
eagle_pod_container_resource_limits_cpu_cores{container="apigateway-neobank-service-php-fpm",namespace="apigateway-neobank-service",node="ip-10-18-77-142.eu-west-1.compute.internal",phase="Running",pod="apigateway-neobank-service-php-fpm-deployment-b6bcf8c5-vqrn4",qos="BestEffort"} 0.0
eagle_pod_container_resource_limits_cpu_cores{container="authentication-service-nginx",namespace="authentication-service",node="ip-10-18-175-9.eu-west-1.compute.internal",phase="Running",pod="authentication-service-nginx-deployment-75dbbf465c-6d8zp",qos="BestEffort"} 0.0
eagle_pod_container_resource_limits_cpu_cores{container="authentication-service-php-fpm",namespace="authentication-service",node="ip-10-18-77-142.eu-west-1.compute.internal",phase="Running",pod="authentication-service-php-fpm-deployment-86b46c5c9c-x5wxs",qos="BestEffort"} 0.0
eagle_pod_container_resource_limits_cpu_cores{container="aws-node",namespace="kube-system",node="ip-10-18-175-9.eu-west-1.compute.internal",phase="Running",pod="aws-node-64gd2",qos="Burstable"} 0.0
eagle_pod_container_resource_limits_cpu_cores{container="aws-node",namespace="kube-system",node="ip-10-18-255-21.eu-west-1.compute.internal",phase="Running",pod="aws-node-5whdq",qos="Burstable"} 0.0
eagle_pod_container_resource_limits_cpu_cores{container="aws-node",namespace="kube-system",node="ip-10-18-77-142.eu-west-1.compute.internal",phase="Running",pod="aws-node-pxrb5",qos="Burstable"} 0.0
eagle_pod_container_resource_limits_cpu_cores{container="coredns",namespace="kube-system",node="ip-10-18-175-9.eu-west-1.compute.internal",phase="Running",pod="coredns-7554568866-dkdz6",qos="Burstable"} 0.0
eagle_pod_container_resource_limits_cpu_cores{container="coredns",namespace="kube-system",node="ip-10-18-175-9.eu-west-1.compute.internal",phase="Running",pod="coredns-7554568866-ntbnj",qos="Burstable"} 0.0

weeco commented 5 years ago

Since the metrics output looks fine, this doesn't appear to be an issue with Kube Eagle and/or AWS. Instead, it is probably an issue with your Prometheus job configuration. It should be easy to debug by looking at the metrics in Prometheus and at the Grafana dashboard queries.
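For anyone landing here, a minimal static scrape job for Kube Eagle could look like the sketch below. The job name, namespace, service name and port are assumptions; discovering the target via kubernetes_sd_configs instead of a static target is equally common:

scrape_configs:
  # Hypothetical static target; replace with your own service DNS name and port.
  - job_name: kube-eagle
    scrape_interval: 30s
    static_configs:
      - targets: ['kube-eagle.monitoring.svc.cluster.local:8080']

With a working scrape job, a query like eagle_node_resource_usage_cpu_cores in the Prometheus UI should return one series per node; if it does, the remaining problem lies in the Grafana dashboard queries rather than in Kube Eagle.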

sc250024 commented 5 years ago

I'm having the exact same problem; I'm on AWS, but the only difference is that I'm using kops, so I don't think this is an EKS-specific issue. @soufianez75 Are you running the Prometheus Operator? If so, what version?
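(For context: with the Prometheus Operator, the scrape target is usually declared through a ServiceMonitor rather than a static scrape job. A minimal sketch, assuming a kube-eagle Service in the monitoring namespace with a port named "metrics"; all names and labels below are assumptions and must match your own resources:)

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-eagle
  namespace: monitoring
  labels:
    release: prometheus   # must match the operator's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: kube-eagle     # must match the labels on the kube-eagle Service
  endpoints:
    - port: metrics       # name of the port on the Service, not the number
      interval: 30s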

sc250024 commented 5 years ago

@weeco Also, can we open this back up please? I would like to see what the problem is so that we can diagnose this.

weeco commented 5 years ago

@sc250024 I am not sure I can help at all if the response from the /metrics endpoint is correct.