elastic / elastic-agent

Elastic Agent - single, unified way to add monitoring for logs, metrics, and other types of data to a host.

Is it normal for elastic-agent to use 15-22% of CPU/memory on a K8S cluster? #1708

Open bshetti opened 2 years ago

bshetti commented 2 years ago

Upon using the Elastic agent daemon set with K8S integration on a GKE cluster, it was noticed that the elastic-agent running in kube-system would use 12-22% of CPU depending on the node, and 17% of memory. See image.

Is this normal?

[Screenshot: Screen Shot 2022-11-10 at 2 14 00 PM]

cmacknz commented 2 years ago

@gizas do you have any idea what the typical resource consumption of the agent K8S integration is?

gizas commented 2 years ago

ccing @ChrsMark as he is currently running some GKE tests.

I think those numbers are relative to the resources assigned to the container, not to the whole node. So the percentage indicates usage against the sizing you specified in the manifest.

So just to verify the dashboards: we are talking about the k8s Integration prior to 1.26, correct? And no special configuration of the agent, @bshetti?

ChrsMark commented 2 years ago

Hey, here is what I see for Agents running on a 3-node GKE cluster and k8s integration enabled:

[Image: AgentUsage]

On the right the graphs show the usage as PCT of the defined limit (of the specific Pod) while on the left the graphs are PCTs of the total resource of the node.
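To make the difference between the two kinds of graphs concrete, here is a minimal sketch of the arithmetic. The function name and all figures (350 MiB used, 700 MiB limit, 16 GiB node) are hypothetical and chosen only for illustration; they are not taken from the dashboards in this thread.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def usage_percentages(used_bytes, limit_bytes, node_bytes):
    """Return (pct_of_limit, pct_of_node) for one container's memory usage."""
    pct_of_limit = 100.0 * used_bytes / limit_bytes
    pct_of_node = 100.0 * used_bytes / node_bytes
    return pct_of_limit, pct_of_node


# Hypothetical figures: 350 MiB in use, 700 MiB Pod limit, 16 GiB node.
pct_limit, pct_node = usage_percentages(350 * MIB, 700 * MIB, 16 * GIB)
print(f"{pct_limit:.1f}% of limit, {pct_node:.1f}% of node")
```

The same absolute usage looks large against the Pod limit and small against the node total, which is why it matters which denominator a given panel uses.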

iamjosh007 commented 2 years ago

Memory usage is pretty high, close to 90%, as we observed in 4-5 AKS clusters, and it doesn't matter whether we allocate half a gig or close to a gig. We run the latest stack on 8.5, and the agents as well, with the k8s integration close to the latest version, 1.26 or 1.27. ES support and consulting are aware, as this has been going on for a few months, and the suspicion is that it is dropping metrics for certain datasets. This is a serious issue that must be addressed at the earliest.

iamjosh007 commented 2 years ago

> Hey, here is what I see for Agents running on a 3-node GKE cluster and k8s integration enabled:
>
> AgentUsage
>
> On the right the graphs show the usage as PCT of the defined limit (of the specific Pod) while on the left the graphs are PCTs of the total resource of the node.

The viz title is the same for memory usage. This needs to be corrected.

"Memory Usage as Pct of the Total Node Memory [Metrics Kubernetes]"

ChrsMark commented 2 years ago

> The viz title is same for memory usage - This needs to be corrected.
>
> "Memory Usage as Pct of the Total Node Memory [Metrics Kubernetes]"

Fix: https://github.com/elastic/integrations/pull/4671

iamjosh007 commented 2 years ago

Providing additional details on memory usage here. Here is my resource spec; I bumped the agent's default memory from 500Mi to 1Gi, and CPU as well. Any thoughts? The app usage is very minimal, specific to the services deployed. This is on AKS (Azure).

resources:
  limits:
    cpu: 400m
    memory: 1Gi
  requests:
    cpu: 100m
    memory: 200Mi
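For reference, a rough sketch of what the quantities in this spec work out to, and what "close to 90%" of the old 500Mi and the new 1Gi memory limit would mean in absolute terms. The helper names `parse_memory` and `parse_cpu` are hypothetical, and the parsing covers only the suffixes used here (Ki/Mi/Gi and millicores), not the full Kubernetes quantity syntax.

```python
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def parse_memory(quantity):
    """Convert a quantity such as '1Gi' or '200Mi' to bytes (binary suffixes only)."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # assume a plain byte count


def parse_cpu(quantity):
    """Convert a quantity such as '400m' (millicores) or '1' to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


limits = {"cpu": "400m", "memory": "1Gi"}
requests = {"cpu": "100m", "memory": "200Mi"}

print("CPU limit in cores:", parse_cpu(limits["cpu"]))
print("CPU request in cores:", parse_cpu(requests["cpu"]))

# Absolute memory in use if the container sits at 90% of each limit.
for limit in ("500Mi", limits["memory"]):
    used_mib = 0.9 * parse_memory(limit) / 1024 ** 2
    print(f"90% of {limit} is about {used_mib:.0f} MiB")
```

If usage stays near 90% of the limit at both sizes, the absolute footprint roughly doubled along with the limit, which may be worth checking against the raw byte values rather than the percentage panels.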
