elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

We are not seeing metrics for cpu, memory and network #31124

Open SonalJain1707 opened 2 years ago

SonalJain1707 commented 2 years ago

Hi,

We are trying to collect metrics from cloud converge, but we are not able to see metrics for CPU, memory, and network:

**kubernetes.node.network.rx.bytes 0**  
**kubernetes.node.network.rx.errors 0**  
**kubernetes.node.network.tx.bytes 0**  
**kubernetes.node.network.tx.errors 0**  
kubernetes.node.runtime.imagefs.available.bytes 21,541,470,208  
kubernetes.node.runtime.imagefs.capacity.bytes 64,100,753,408  
kubernetes.node.runtime.imagefs.used.bytes 65,713,395,269
kubernetes.node.fs.used.bytes 39,830,368,256  
kubernetes.node.memory.available.bytes 67,426,185,216  
kubernetes.node.memory.majorpagefaults 0  
kubernetes.node.memory.pagefaults 0  
kubernetes.node.memory.rss.bytes 0  
kubernetes.node.memory.usage.bytes 0  
kubernetes.node.memory.workingset.bytes 0
SonalJain1707 commented 2 years ago

Team:SAP

botelastic[bot] commented 2 years ago

Thank you very much for creating this issue. However, we would kindly like to ask you to post all questions and issues on the Discuss forum first. In addition to awesome, knowledgeable community contributors, core Beats developers are on the forums every single day to help you out as well. So, your questions will reach a wider audience there, and if we confirm that there is a bug, then you can reopen this issue with the new information or open a new one.

sebglon commented 1 year ago

Hi, we have the same issue on Metricbeat 8.4.2 (K8s: 1.22.2, Elastic cluster: v8.4.2).

Here are my clusterRoles:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metricbeat
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - namespaces
      - events
      - pods
      - services
    verbs: ["get", "list", "watch"]
  # Enable this rule only if planning to use Kubernetes keystore
  #- apiGroups: [""]
  #  resources:
  #  - secrets
  #  verbs: ["get"]
  - apiGroups: ["extensions"]
    resources:
      - replicasets
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources:
      - statefulsets
      - deployments
      - replicasets
    verbs: ["get", "list", "watch"]
  - apiGroups: ["batch"]
    resources:
      - jobs
      - cronjobs
    verbs: ["get", "list", "watch"]
  - apiGroups:
      - ""
    resources:
      - nodes/stats
    verbs:
      - get
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get
sebglon commented 1 year ago

If I run this command inside the Metricbeat DaemonSet pod: `curl -k https://${NODE_NAME}:10250/stats/summary --header "Authorization: Bearer $TOKEN"`, I do get my pod memory usage:

 {
   "podRef": {
    "name": "my-pod",
    "namespace": "default",
    "uid": "7be45b82-0395-4ae9-a0ab-7c275c121565"
   },
   "startTime": "2022-10-27T15:42:12Z",
   "containers": [
    {
     "name": "postgres",
     "startTime": "2022-10-27T15:42:15Z",
     "cpu": {
      "time": "2022-10-27T16:02:59Z",
      "usageNanoCores": 3272512,
      "usageCoreNanoSeconds": 19228071000
     },
     "memory": {
      "time": "2022-10-27T16:02:59Z",
      "workingSetBytes": 148652032
     },
     "rootfs": {
      "time": "2022-10-27T16:02:52Z",
      "availableBytes": 110666403840,
      "capacityBytes": 135102586880,
      "usedBytes": 196608,
      "inodesFree": 8147061,
      "inodes": 8380416,
      "inodesUsed": 60
     },
     "logs": {
      "time": "2022-10-27T16:02:59Z",
      "availableBytes": 110666403840,
      "capacityBytes": 135102586880,
      "usedBytes": 45056,
      "inodesFree": 8147061,
      "inodes": 8380416,
      "inodesUsed": 1
     }
    }
   ],
   "cpu": {
    "time": "2022-10-27T16:02:58Z",
    "usageNanoCores": 3221979,
    "usageCoreNanoSeconds": 19417139000
   },
   "memory": {
    "time": "2022-10-27T16:02:58Z",
    "availableBytes": 7367151616,
    "usageBytes": 190943232,
    "workingSetBytes": 149041152,
    "rssBytes": 41623552,
    "pageFaults": 803187,
    "majorPageFaults": 0
   },
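To make the gap in a response like this easier to spot, here is a minimal sketch that lists which memory keys are absent per container. The function name and the set of expected keys are my own assumptions for illustration (taken from the fields visible in the pod-level `memory` object), not anything defined by kubelet or Metricbeat.

```python
from typing import Any, Dict, List

# Assumption: the keys we expect are the ones visible at pod level in the
# response above; this is not an official kubelet schema.
EXPECTED_MEMORY_KEYS = ("usageBytes", "workingSetBytes", "rssBytes")


def missing_container_memory_fields(pod: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each container name to the expected memory keys it does not report."""
    missing: Dict[str, List[str]] = {}
    for container in pod.get("containers", []):
        memory = container.get("memory", {})
        absent = [key for key in EXPECTED_MEMORY_KEYS if key not in memory]
        if absent:
            missing[container.get("name", "<unnamed>")] = absent
    return missing


# Trimmed copy of the pod entry from the /stats/summary output above.
pod = {
    "podRef": {"name": "my-pod", "namespace": "default"},
    "containers": [
        {
            "name": "postgres",
            "memory": {
                "time": "2022-10-27T16:02:59Z",
                "workingSetBytes": 148652032,
            },
        }
    ],
    "memory": {
        "usageBytes": 190943232,
        "workingSetBytes": 149041152,
        "rssBytes": 41623552,
    },
}

print(missing_container_memory_fields(pod))
# → {'postgres': ['usageBytes', 'rssBytes']}
```

In this sample the container only reports `workingSetBytes`, while the pod-level object still carries `usageBytes` and `rssBytes`.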
sebglon commented 1 year ago

After a short investigation: container memory and CPU usage have been removed from the kubelet response here

But in Metricbeat, we calculate the pod memory usage from the container memory usage here. That is a problem for me: pod.memory.usageBytes ends up being computed from a metric that no longer exists. We should reuse the pod-level values that kubelet provides.
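A rough sketch of what I mean, in Python for brevity. This is not the Metricbeat code (which is Go), and `pod_memory_usage_bytes` is a hypothetical helper name: prefer the pod-level `memory.usageBytes` from kubelet and only fall back to summing containers when every container actually reports a value.

```python
from typing import Any, Dict, Optional


def pod_memory_usage_bytes(pod: Dict[str, Any]) -> Optional[int]:
    """Return pod memory usage, preferring the value kubelet reports for the pod."""
    pod_level = pod.get("memory", {}).get("usageBytes")
    if pod_level is not None:
        return pod_level

    # Fallback: sum per-container usage, but only if no container is missing it.
    per_container = [
        c.get("memory", {}).get("usageBytes") for c in pod.get("containers", [])
    ]
    if not per_container or any(value is None for value in per_container):
        return None  # unknown, rather than a misleading 0
    return sum(per_container)


# Pod-level value present, container value missing (as on my 1.22.2 node).
newer = {
    "containers": [{"name": "postgres", "memory": {"workingSetBytes": 148652032}}],
    "memory": {"usageBytes": 190943232},
}
# Only container-level values available.
containers_only = {
    "containers": [{"name": "filebeat", "memory": {"usageBytes": 95526912}}],
}

print(pod_memory_usage_bytes(newer))            # → 190943232
print(pod_memory_usage_bytes(containers_only))  # → 95526912
```

Returning `None` instead of 0 when the data is not there would also make it clear on the dashboards that the metric is missing, not that usage is zero.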

sebglon commented 1 year ago

Hi, in my opinion the test dataset available here is not accurate: https://github.com/elastic/beats/blob/33df99af2f7eb5757dac6e849e133c1370303f26/metricbeat/module/kubernetes/_meta/test/stats_summary_multiple_containers.json#L85

It does not match the kubelet data: https://github.com/kubernetes/kubernetes/blob/release-1.22/pkg/kubelet/metrics/collectors/resource_metrics.go

sebglon commented 1 year ago

Here is an example kubelet response on K8s 1.17:

  {
   "podRef": {
    "name": "filebeat-5bd7l",
    "namespace": "beats",
    "uid": "2410a8e2-8732-45d0-a168-4066147bcc4c"
   },
   "startTime": "2022-09-28T09:45:25Z",
   "containers": [
    {
     "name": "filebeat",
     "startTime": "2022-09-28T09:45:35Z",
     "cpu": {
      "time": "2022-10-28T07:57:56Z",
      "usageNanoCores": 11373776,
      "usageCoreNanoSeconds": 24321372471944
     },
     "memory": {
      "time": "2022-10-28T07:57:56Z",
      "availableBytes": 177180672,
      "usageBytes": 95526912,
      "workingSetBytes": 84963328,
      "rssBytes": 80998400,
      "pageFaults": 19847091,
      "majorPageFaults": 0
     },
     "rootfs": {
      "time": "2022-10-28T07:57:56Z",
      "availableBytes": 78362660864,
      "capacityBytes": 135247413248,
      "usedBytes": 53248,
      "inodesFree": 7446403,
      "inodes": 8388608,
      "inodesUsed": 13
     },
     "logs": {
      "time": "2022-10-28T07:57:56Z",
      "availableBytes": 78362660864,
      "capacityBytes": 135247413248,
      "usedBytes": 151998464,
      "inodesFree": 7446403,
      "inodes": 8388608,
      "inodesUsed": 942205
     }
    }
   ],
   "cpu": {
    "time": "2022-10-28T07:57:53Z",
    "usageNanoCores": 13635645,
    "usageCoreNanoSeconds": 24321593387534
   },
   "memory": {
    "time": "2022-10-28T07:57:53Z",
    "availableBytes": 175927296,
    "usageBytes": 96772096,
    "workingSetBytes": 86216704,
    "rssBytes": 80998400,
    "pageFaults": 0,
    "majorPageFaults": 0
   },
   "network": {
    "time": "2022-10-28T07:57:56Z",
    "name": "eth0",
    "rxBytes": 23609641833189,
    "rxErrors": 0,
    "txBytes": 20488187806525,
    "txErrors": 0,
    "interfaces": [
     {
      "name": "cali931f1a1ec64",
      "rxBytes": 10908725315,
      "rxErrors": 0,
      "txBytes": 67507172815,
      "txErrors": 0
     },
     {
      "name": "eth0",
      "rxBytes": 23609641833189,
      "rxErrors": 0,
      "txBytes": 20488187806525,
      "txErrors": 0
     },
     {
      "name": "calif160940b4c5",
      "rxBytes": 0,
      "rxErrors": 0,
      "txBytes": 1420536,
      "txErrors": 0
     },
     {
      "name": "tunl0",
      "rxBytes": 18158882920140,
      "rxErrors": 0,
      "txBytes": 17254991431206,
      "txErrors": 0
     },
     {
      "name": "kube-ipvs0",
      "rxBytes": 0,
      "rxErrors": 0,
      "txBytes": 0,
      "txErrors": 0
     },
     {
      "name": "nodelocaldns",
      "rxBytes": 0,
      "rxErrors": 0,
      "txBytes": 0,
      "txErrors": 0
     }
    ]
   },
   "volume": [
    {
     "time": "2022-10-27T12:44:03Z",
     "availableBytes": 78390005760,
     "capacityBytes": 135247413248,
     "usedBytes": 12288,
     "inodesFree": 7446592,
     "inodes": 8388608,
     "inodesUsed": 5,
     "name": "config"
    },
    {
     "time": "2022-10-27T12:44:03Z",
     "availableBytes": 32193466368,
     "capacityBytes": 32193474560,
     "usedBytes": 8192,
     "inodesFree": 7859728,
     "inodes": 7859735,
     "inodesUsed": 7,
     "name": "vault-tls-secrets"
    },
    {
     "time": "2022-10-28T07:57:30Z",
     "availableBytes": 32193462272,
     "capacityBytes": 32193474560,
     "usedBytes": 12288,
     "inodesFree": 7859731,
     "inodes": 7859735,
     "inodesUsed": 4,
     "name": "vault-secrets"
    },
    {
     "time": "2022-10-27T12:44:03Z",
     "availableBytes": 32193462272,
     "capacityBytes": 32193474560,
     "usedBytes": 12288,
     "inodesFree": 7859726,
     "inodes": 7859735,
     "inodesUsed": 9,
     "name": "filebeat-token-jml4h"
    },
    {
     "time": "2022-10-28T07:57:30Z",
     "availableBytes": 32193466368,
     "capacityBytes": 32193474560,
     "usedBytes": 8192,
     "inodesFree": 7859732,
     "inodes": 7859735,
     "inodesUsed": 3,
     "name": "home-init"
    }
   ],
   "ephemeral-storage": {
    "time": "2022-10-28T07:58:04Z",
    "availableBytes": 78362660864,
    "capacityBytes": 135247413248,
    "usedBytes": 152064000,
    "inodesFree": 7446403,
    "inodes": 8388608,
    "inodesUsed": 18
   }
  },
sebglon commented 1 year ago

For me the issue occurs on kubelet 1.22.2 and seems to be fixed on 1.22.9. It happens because we use systemd as the cgroup driver (https://github.com/kubernetes/kubernetes/issues/103366#issuecomment-885430574)

fmiqbal commented 1 year ago

I stumbled upon this issue; I don't know if mine is also related.

I'm using microk8s v1.27, ES 8.7 self-managed, and the Kubernetes integration on a Fleet agent.

Memory bytes are present in the kubelet response:

``` { "podRef": { "name": "dashboard-metrics-scraper-5cb4f4bb9c-svjgj", "namespace": "kube-system", "uid": "38ee6618-3d64-4ec8-809b-ed95da8cf9d8" }, "startTime": "2023-05-11T09:48:59Z", "containers": [ { "name": "dashboard-metrics-scraper", "startTime": "2023-05-11T09:49:15Z", "cpu": { "time": "2023-05-13T15:29:53Z", "usageNanoCores": 128631, "usageCoreNanoSeconds": 40989303000 }, "memory": { "time": "2023-05-13T15:29:53Z", "workingSetBytes": 26316800 }, "rootfs": { "time": "2023-05-13T15:29:45Z", "availableBytes": 64923242496, "capacityBytes": 101148950528, "usedBytes": 45056, "inodesFree": 5572844, "inodes": 6316032, "inodesUsed": 14 }, "logs": { "time": "2023-05-13T15:29:53Z", "availableBytes": 64923242496, "capacityBytes": 101148950528, "usedBytes": 3956736, "inodesFree": 5572844, "inodes": 6316032, "inodesUsed": 1 } } ], "cpu": { "time": "2023-05-13T15:29:50Z", "usageNanoCores": 142556, "usageCoreNanoSeconds": 41029901000 }, "memory": { "time": "2023-05-13T15:29:50Z", "usageBytes": 26722304, "workingSetBytes": 26587136, "rssBytes": 23400448, "pageFaults": 50292, "majorPageFaults": 0 }, "volume": [ { "time": "2023-05-13T15:29:51Z", "availableBytes": 64923238400, "capacityBytes": 101148950528, "usedBytes": 102400, "inodesFree": 5572844, "inodes": 6316032, "inodesUsed": 2, "name": "tmp-volume" }, { "time": "2023-05-13T15:29:51Z", "availableBytes": 33564921856, "capacityBytes": 33564934144, "usedBytes": 12288, "inodesFree": 4110073, "inodes": 4110082, "inodesUsed": 9, "name": "kube-api-access-bcdj5" } ], "ephemeral-storage": { "time": "2023-05-13T15:29:53Z", "availableBytes": 64923242496, "capacityBytes": 101148950528, "usedBytes": 4108288, "inodesFree": 5572844, "inodes": 6316032, "inodesUsed": 18 }, "process_stats": { "process_count": 0 } } ```

But they are missing from the document:

``` { "_index": ".ds-metrics-kubernetes.container-default-2023.05.13-000003", "_id": "ZAi8FYgBDrY9dlLUy2-L", "_version": 1, "_score": 0, "_source": { "container": { "memory": { "usage": 0 }, "name": "dashboard-metrics-scraper", "runtime": "containerd", "cpu": { "usage": 0.00000977875 }, "id": "76fec8d77a6dc0b57124a01363aa8649a9242333050716fa7738e566273b57d4" }, "kubernetes": { "container": { "start_time": "2023-05-11T09:49:15Z", "memory": { "rss": { "bytes": 0 }, "usage": { "node": { "pct": 0 }, "bytes": 0, "limit": { "pct": 0 } }, "majorpagefaults": 0, "available": { "bytes": 0 }, "workingset": { "bytes": 26390528, "limit": { "pct": 0.0007838043133932608 } }, "pagefaults": 0 }, "rootfs": { "inodes": { "used": 14 }, "available": { "bytes": 64919298048 }, "used": { "bytes": 45056 }, "capacity": { "bytes": 101148950528 } }, "name": "dashboard-metrics-scraper", "cpu": { "usage": { "node": { "pct": 0.00000977875 }, "core": { "ns": 41002195000 }, "limit": { "pct": 0.00000977875 }, "nanocores": 156460 } }, "logs": { "inodes": { "count": 6316032, "used": 1, "free": 5572843 }, "available": { "bytes": 64919298048 }, "used": { "bytes": 3960832 }, "capacity": { "bytes": 101148950528 } } }, "node": { --- REDACTED --- }, "pod": { "uid": "38ee6618-3d64-4ec8-809b-ed95da8cf9d8", "ip": "10.1.128.246", "name": "dashboard-metrics-scraper-5cb4f4bb9c-svjgj" }, "namespace": "kube-system", "namespace_uid": "a26a48cb-568a-4ebd-8af6-877293571d1b", "replicaset": { "name": "dashboard-metrics-scraper-5cb4f4bb9c" }, "namespace_labels": { "kubernetes_io/metadata_name": "kube-system" }, "deployment": { "name": "dashboard-metrics-scraper" }, "labels": { "pod-template-hash": "5cb4f4bb9c", "k8s-app": "dashboard-metrics-scraper" } }, "agent": { --- REDACTED --- },, "@timestamp": "2023-05-13T15:31:13.430Z", "ecs": { "version": "8.0.0" }, "data_stream": { "namespace": "default", "type": "metrics", "dataset": "kubernetes.container" }, "service": { "address": { --- REDACTED --- }, "type": "kubernetes" 
}, "elastic_agent": { "id": "5b60defe-fa9b-4496-bba9-d002e5e3a5d0", "version": "8.7.1", "snapshot": false }, "host": { --- REDACTED --- }, "metricset": { "period": 10000, "name": "container" }, "event": { "duration": 72459532, "agent_id_status": "verified", "ingested": "2023-05-13T15:31:14Z", "module": "kubernetes", "dataset": "kubernetes.container" } }, "fields": { "kubernetes.node.uid": [ "b3c11f82-f3a1-4b77-9a97-fc3e59b1e5f8" ], "elastic_agent.version": [ "8.7.1" ], "kubernetes.namespace_uid": [ "a26a48cb-568a-4ebd-8af6-877293571d1b" ], "host.os.name.text": [ "Ubuntu" ], "kubernetes.deployment.name": [ "dashboard-metrics-scraper" ], "kubernetes.container.logs.inodes.used": [ 1 ], "host.hostname": { --- REDACTED --- },, "host.mac": [ "00-50-56-BF-68-F3", "00-50-56-BF-E5-6D", "66-3D-EB-3E-8B-EC", "EE-EE-EE-EE-EE-EE" ], "kubernetes.node.labels.kubernetes_io/os": [ "linux" ], "container.id": [ "76fec8d77a6dc0b57124a01363aa8649a9242333050716fa7738e566273b57d4" ], "kubernetes.labels.pod-template-hash": [ "5cb4f4bb9c" ], "service.type": [ "kubernetes" ], "kubernetes.container.cpu.usage.core.ns": [ 41002195000 ], "container.name": [ "dashboard-metrics-scraper" ], "container.memory.usage": [ 0 ], "host.os.version": [ "20.04.6 LTS (Focal Fossa)" ], "kubernetes.namespace": [ "kube-system" ], "kubernetes.node.labels.beta_kubernetes_io/os": [ "linux" ], "kubernetes.container.memory.usage.limit.pct": [ 0 ], "host.os.name": [ "Ubuntu" ], "agent.name": { --- REDACTED --- }, "host.name": { --- REDACTED --- }, "kubernetes.labels.k8s-app": [ "dashboard-metrics-scraper" ], "event.agent_id_status": [ "verified" ], "kubernetes.container.memory.usage.node.pct": [ 0 ], "host.os.type": [ "linux" ], "kubernetes.container.memory.available.bytes": [ 0 ], "kubernetes.container.logs.available.bytes": [ 64919298048 ], "kubernetes.container.logs.inodes.count": [ 6316032 ], "kubernetes.container.cpu.usage.node.pct": [ 0 ], "kubernetes.container.memory.pagefaults": [ 0 ], "data_stream.type": [ 
"metrics" ], "kubernetes.node.labels.node_kubernetes_io/microk8s-controlplane": [ "microk8s-controlplane" ], "host.architecture": [ "x86_64" ], "kubernetes.container.cpu.usage.limit.pct": [ 0 ], "container.runtime": [ "containerd" ], "agent.id": [ "5b60defe-fa9b-4496-bba9-d002e5e3a5d0" ], "host.containerized": [ false ], "container.cpu.usage": [ 0 ], "ecs.version": [ "8.0.0" ], "service.address": { --- REDACTED --- },, "kubernetes.container.rootfs.available.bytes": [ 64919298048 ], "kubernetes.container.memory.rss.bytes": [ 0 ], "agent.version": [ "8.7.1" ], "kubernetes.container.memory.workingset.bytes": [ 26390528 ], "host.os.family": [ "debian" ], "kubernetes.node.name": { --- REDACTED --- },, "kubernetes.container.logs.inodes.free": [ 5572843 ], "kubernetes.node.hostname": { --- REDACTED --- }, "kubernetes.pod.uid": [ "38ee6618-3d64-4ec8-809b-ed95da8cf9d8" ], "kubernetes.container.memory.usage.bytes": [ 0 ], "kubernetes.container.logs.used.bytes": [ 3960832 ], "kubernetes.container.rootfs.used.bytes": [ 45056 ], "kubernetes.container.start_time": [ "2023-05-11T09:49:15.000Z" ], "kubernetes.container.cpu.usage.nanocores": [ 156460 ], "host.ip": [ "10.9.247.13", "fe80::250:56ff:febf:e56d", "10.1.128.192", "fe80::643d:ebff:fe3e:8bec", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee", "fe80::ecee:eeff:feee:eeee" ], "agent.type": [ "metricbeat" ], "event.module": [ "kubernetes" ], 
"host.os.kernel": [ "5.10.0-21-amd64" ], "kubernetes.container.rootfs.inodes.used": [ 14 ], "kubernetes.pod.name": [ "dashboard-metrics-scraper-5cb4f4bb9c-svjgj" ], "elastic_agent.snapshot": [ false ], "host.id": [ "846f0d3601044e94b0c1a4208da1b560" ], "kubernetes.pod.ip": [ "10.1.128.246" ], "kubernetes.container.memory.majorpagefaults": [ 0 ], "kubernetes.container.name": [ "dashboard-metrics-scraper" ], "elastic_agent.id": [ "5b60defe-fa9b-4496-bba9-d002e5e3a5d0" ], "data_stream.namespace": [ "default" ], "kubernetes.replicaset.name": [ "dashboard-metrics-scraper-5cb4f4bb9c" ], "metricset.period": [ 10000 ], "host.os.codename": [ "focal" ], "kubernetes.container.rootfs.capacity.bytes": [ 101148950528 ], "kubernetes.namespace_labels.kubernetes_io/metadata_name": [ "kube-system" ], "kubernetes.node.labels.kubernetes_io/hostname": { --- REDACTED --- },, "kubernetes.container.memory.workingset.limit.pct": [ 0.001 ], "metricset.name": [ "container" ], "event.duration": [ 72459532 ], "kubernetes.node.labels.microk8s_io/cluster": [ "true" ], "kubernetes.node.labels.beta_kubernetes_io/arch": [ "amd64" ], "event.ingested": [ "2023-05-13T15:31:14.000Z" ], "kubernetes.container.logs.capacity.bytes": [ 101148950528 ], "@timestamp": [ "2023-05-13T15:31:13.430Z" ], "host.os.platform": [ "ubuntu" ], "data_stream.dataset": [ "kubernetes.container" ], "kubernetes.node.labels.kubernetes_io/arch": [ "amd64" ], "agent.ephemeral_id": [ "0b142839-4f6a-44a4-bd76-8eb36cbcd63c" ], "event.dataset": [ "kubernetes.container" ] } } ```

From the dashboard perspective, only these are missing: Overview -> node memory, Pods -> network rx/tx, and Nodes -> network rx/tx.
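To see at a glance which memory fields in a document like the one above came through as zero, here is a small sketch. The helper names (`flatten`, `zero_valued`) are hypothetical, and the input is just the `kubernetes.container.memory` object copied from the document.

```python
from typing import Any, Dict, Iterator, List, Tuple


def flatten(node: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted.path, value) pairs for every leaf of a nested dict."""
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def zero_valued(node: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return the sorted dotted paths whose numeric value is exactly 0."""
    return sorted(
        path
        for path, value in flatten(node, prefix)
        if isinstance(value, (int, float))
        and not isinstance(value, bool)
        and value == 0
    )


# kubernetes.container.memory, copied from the document above.
memory = {
    "rss": {"bytes": 0},
    "usage": {"node": {"pct": 0}, "bytes": 0, "limit": {"pct": 0}},
    "majorpagefaults": 0,
    "available": {"bytes": 0},
    "workingset": {"bytes": 26390528, "limit": {"pct": 0.0007838043133932608}},
    "pagefaults": 0,
}

for path in zero_valued(memory, "kubernetes.container.memory"):
    print(path)
```

A zero here does not prove the metric is missing (a count like `majorpagefaults` can legitimately be 0), so the list is only a starting point for comparing against the kubelet response.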


botelastic[bot] commented 5 months ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!