influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

Procstat input with exe or pattern options is slow on Linux #7642

Closed: danielnelson closed this issue 3 years ago

danielnelson commented 4 years ago

The procstat plugin is too expensive to run on Linux. The cost grows when several instances of the plugin are configured and as the number of processes on the system increases.

Relevant telegraf.conf:

[[inputs.procstat]]
  exe = ".*"

System info:

Telegraf 1.14.3

Steps to reproduce:

  1. Run Telegraf with the procstat plugin enabled, using the configuration above.

Expected behavior:

The plugin should use fewer system resources.

Actual behavior:

Excessive CPU usage and too many memory allocations.

Additional info:

$ go test -short ./plugins/inputs/procstat/... -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/influxdata/telegraf/plugins/inputs/procstat
BenchmarkPattern-4                    13          82058380 ns/op        22489259 B/op     236393 allocs/op
BenchmarkFullPattern-4                13          82729865 ns/op        23000728 B/op     237198 allocs/op
PASS
ok      github.com/influxdata/telegraf/plugins/inputs/procstat  3.238s

danielnelson commented 4 years ago

Related: https://github.com/shirou/gopsutil/issues/842

jsteenb2 commented 4 years ago

Additional memory-profile context from the benchmark:

flame graph ![image](https://user-images.githubusercontent.com/17263167/89929285-e675c080-dbbd-11ea-8bc1-b45be42949fd.png)
method list ![image](https://user-images.githubusercontent.com/17263167/89929202-c47c3e00-dbbd-11ea-9afc-cf6e6a586e6f.png)
call tree ![image](https://user-images.githubusercontent.com/17263167/89929609-4704fd80-dbbe-11ea-9eb4-ad308a8aaca1.png)

ssoroka commented 4 years ago

related to https://github.com/influxdata/telegraf/issues/7884

jessingrass commented 3 years ago

This should be resolved by #7884.