elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash

Add support for cgroup v2 #14534

Open rpasche opened 2 years ago

rpasche commented 2 years ago

Hi Elastic,

Please add support for fetching and exporting cgroup metrics when Logstash runs under cgroup v2.

cgroup v2 is now the default in most current Linux distributions and kernels.

Elasticsearch already contains code to parse the cgroup v2 structure (see also https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/monitor/os/OsProbe.java#L688-L708).

Currently https://github.com/elastic/logstash/blob/main/logstash-core/lib/logstash/instrument/periodic_poller/cgroup.rb only supports parsing the cgroup v1 structure. Below is example information from Logstash running in debug mode within k8s 1.22.4:

logstash@logstash:/opt$ cat /proc/1/environ | tr '\0' '\n' | grep LS_JAVA_OPTS
LS_JAVA_OPTS=-Dls.cgroup.cpuacct.path.override=/ -Dls.cgroup.cpu.path.override=/ -Dnetworkaddress.cache.ttl=60
logstash@logstash:/opt$ stat -fc %T /sys/fs/cgroup/
cgroup2fs
logstash@logstash:/opt$ cat /proc/1/cgroup
0::/
logstash@logstash:/opt$ cat /sys/fs/cgroup/cpu.stat
usage_usec 406966269
user_usec 345440634
system_usec 61525634
nr_periods 18442
nr_throttled 151
throttled_usec 72231939
logstash@logstash:/opt$ uname -a
Linux logstash 5.15.63-flatcar #1 SMP Mon Aug 29 18:27:27 -00 2022 x86_64 x86_64 x86_64 GNU/Linux
logstash@logstash:/opt$
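
To illustrate what parsing this layout could look like, here is a minimal Ruby sketch. It is not taken from Logstash's cgroup.rb; the helper names `cgroup_v2_path` and `parse_cpu_stat` are hypothetical. It assumes a pure cgroup v2 host, where /proc/self/cgroup contains a single `0::<path>` line and cpu.stat is a flat list of `key value` pairs, as in the output above.

```ruby
# Hypothetical helpers for illustration only; not part of Logstash.

# Returns the unified (v2) cgroup path from the content of /proc/<pid>/cgroup,
# or nil when no "0::<path>" entry is present (for example on a pure v1 host).
# Hybrid hosts also carry a "0::" line, so a real implementation would likely
# check the filesystem type of /sys/fs/cgroup as well.
def cgroup_v2_path(proc_cgroup_content)
  proc_cgroup_content.each_line do |line|
    id, controllers, path = line.chomp.split(":", 3)
    return path if id == "0" && controllers == "" && !path.nil?
  end
  nil
end

# Parses the content of a v2 cpu.stat file into a Hash of String => Integer.
def parse_cpu_stat(content)
  content.each_line.each_with_object({}) do |line, stats|
    key, value = line.split
    stats[key] = Integer(value) if key && value
  end
end
```

With the files shown above, `cgroup_v2_path(File.read("/proc/self/cgroup"))` would return `"/"`, and `parse_cpu_stat(File.read("/sys/fs/cgroup/cpu.stat"))["nr_throttled"]` would return `151`.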

The Logstash logs then show:

{"level":"DEBUG","loggerName":"logstash.instrument.periodicpoller.cgroup","timeMillis":1663308955317,"thread":"LogStash::Runner","logEvent":{"message":"One or more required cgroup files or directories not found: /proc/self/cgroup, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/cpu"}}

This results in a completely empty os object when querying the Logstash API:

logstash@logstash:/opt$ curl localhost:9600/_node/stats/os?pretty
{
  "host" : "logstash",
  "version" : "8.4.1",
  "http_address" : "0.0.0.0:9600",
  "id" : "c862a7cf-5c38-463d-a5e6-df2286eb38d2",
  "name" : "logstash",
  "ephemeral_id" : "37bce372-266d-4ba4-9b70-7dee4dbf1097",
  "status" : "green",
  "snapshot" : false,
  "pipeline" : {
    "workers" : 4,
    "batch_size" : 125,
    "batch_delay" : 50
  },
  "monitoring" : {
    "cluster_uuid" : "some_cluster_id"
  },
  "os" : { }
}
logstash@logstash:/opt$

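
For comparison, the following Ruby sketch shows one way the v2 values could be mapped onto v1-style fields so that the os object is populated. This is an assumption about the desired output shape, not the Logstash implementation: the method names are hypothetical, and the field names (`usage_nanos`, `cfs_quota_micros`, and so on) are assumed to follow the v1-based naming of the node stats API. It also assumes that cpu.max holds `<quota> <period>` in microseconds, with the literal `max` meaning "no limit".

```ruby
# Hypothetical mapping for illustration only; not part of Logstash.

# Parses the content of a v2 cpu.max file ("<quota> <period>").
# An unlimited quota ("max") is reported as -1, like v1's cpu.cfs_quota_us.
def parse_cpu_max(content)
  quota, period = content.split
  {
    "cfs_quota_micros" => (quota == "max" ? -1 : Integer(quota)),
    "cfs_period_micros" => Integer(period)
  }
end

# Builds a v1-shaped cgroup stats Hash from parsed v2 cpu.stat values
# (String => Integer) and the result of parse_cpu_max.
# v2 reports microseconds, so times are multiplied by 1000 to get nanoseconds.
def v2_to_os_cgroup(cpu_stat, cpu_max)
  {
    "cpuacct" => { "usage_nanos" => cpu_stat.fetch("usage_usec") * 1000 },
    "cpu" => cpu_max.merge(
      "stat" => {
        "number_of_elapsed_periods" => cpu_stat.fetch("nr_periods", 0),
        "number_of_times_throttled" => cpu_stat.fetch("nr_throttled", 0),
        "time_throttled_nanos" => cpu_stat.fetch("throttled_usec", 0) * 1000
      }
    )
  }
end
```

The throttling counters default to 0 because they appear in cpu.stat only when the cpu controller is enabled for the cgroup.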
rbjorklin commented 1 year ago

This probably also means bumping the JDK version to 15 or greater: https://bugs.java.com/bugdatabase/view_bug?bug_id=8230305