xxxserxxx opened this issue 4 years ago (status: Open)
@xxxserxxx could you also try go-sysinfo? I've been wondering but not much time to try it atm.
Yeah! Thanks; this gives me some impetus to build out proper benchmarks for devices.
Np, please feel free to share the results; it would be very helpful to me (and anyone needing a process data library in Go) as well.
My guess is that `ps` is more efficient than gopsutil because it knows beforehand which information to gather from `/proc/$PID/status`, and gets it in a single file read, while gopsutil first creates a Process variable (it also checks for process existence, since this type can be created for an arbitrary PID number; could you maybe benchmark before this change https://github.com/shirou/gopsutil/pull/821?) and then re-reads the `/proc/$PID/status` file for each new piece of information requested (see how many times `fillFromStatusWithContext` is called).
In fact, we already have the following issues that aim to optimize this: https://github.com/shirou/gopsutil/issues/286, https://github.com/shirou/gopsutil/issues/413.
I wrote a benchmark, and am attaching it here; again, GitHub allows only a restricted set of suffixes, so rename the file to `.go` first.
The attached file is a proper benchmark containing all three implementations: forking, gopsutil, and go-sysinfo. Each function performs similar operations, getting some data about the processes. Compiler optimizing-out is prevented by counting errors in the return values.
```
➜ ps go test -benchtime=20s -bench .
goos: linux
goarch: amd64
pkg: ser1.net/perf
BenchmarkForking-8      2437     9663078 ns/op
--- BENCH: BenchmarkForking-8
    perf_test.go:25: forking errors: 0
    perf_test.go:25: forking errors: 0
    perf_test.go:25: forking errors: 0
BenchmarkGoPSUtil-8     1162    21010525 ns/op
--- BENCH: BenchmarkGoPSUtil-8
    perf_test.go:32: GoPsUtil errors: 0
    perf_test.go:32: GoPsUtil errors: 2
    perf_test.go:32: GoPsUtil errors: 11
BenchmarkGoSysInfo-8    2671     8590463 ns/op
--- BENCH: BenchmarkGoSysInfo-8
    perf_test.go:39: GoSysInfo errors: 0
    perf_test.go:39: GoSysInfo errors: 1
    perf_test.go:39: GoSysInfo errors: 27
PASS
ok      ser1.net/perf   74.941s
```
Of note is that the forking version is the only one that doesn't generate errors when processing data. From what I saw of the errors, they appear to be related to processes disappearing while the libraries are gathering information about them. go-sysinfo consistently generated more errors than gopsutil, across all runs.
I'm seeing vaguely conflicting results. According to the benchmark, go-sysinfo is, indeed, faster than forking. However, in an application the CPU use of go-sysinfo is double that of forking. I said "vaguely" because this is easily explained: forking is more IO bound (I'm guessing), which takes longer but with less CPU impact than a pure syscall interface. However, given that the application isn't in a tight loop (it loops once per second), I wouldn't expect to notice the difference, and `top` reports higher CPU use for go-sysinfo when both are running side-by-side.
How about I go ahead and attach that benchmark, huh?
I meant to @AtakanColak. Also, @Lomanic: what did you mean, "benchmark before this change"? Do you want me to pin the version at the commit prior to #821 and benchmark that?
The commit immediately prior to the merge was v2.20.1, which made things easy. I pinned the version in `go.mod` to that and re-ran the benchmark. It's not perfect, because running them side-by-side in the same test would be more effort, but here are the results:
##### v2.20.1

```
➜ ps go test -benchtime=20s -bench .
go: downloading github.com/shirou/gopsutil v2.20.1+incompatible
goos: linux
goarch: amd64
pkg: ser1.net/perf
BenchmarkForking-8      2430     9777497 ns/op
--- BENCH: BenchmarkForking-8
    perf_test.go:25: forking errors: 0
    perf_test.go:25: forking errors: 0
    perf_test.go:25: forking errors: 0
BenchmarkGoPSUtil-8     1186    21953451 ns/op
--- BENCH: BenchmarkGoPSUtil-8
    perf_test.go:32: GoPsUtil errors: 0
    perf_test.go:32: GoPsUtil errors: 1
    perf_test.go:32: GoPsUtil errors: 16
BenchmarkGoSysInfo-8    2703     8751875 ns/op
--- BENCH: BenchmarkGoSysInfo-8
    perf_test.go:39: GoSysInfo errors: 0
    perf_test.go:39: GoSysInfo errors: 0
    perf_test.go:39: GoSysInfo errors: 28
PASS
ok      ser1.net/perf   77.424s
```
The difference is within an easy margin of error, so I ran it again with just the gopsutil benchmark, switching the `go.mod` version in between:
##### v2.20.1

```
➜ ps go test -benchtime=20s -bench=BenchmarkGoPSUtil .
goos: linux
goarch: amd64
pkg: ser1.net/perf
BenchmarkGoPSUtil-8     1164    20295012 ns/op
--- BENCH: BenchmarkGoPSUtil-8
    perf_test.go:32: GoPsUtil errors: 0
    perf_test.go:32: GoPsUtil errors: 0
    perf_test.go:32: GoPsUtil errors: 7
PASS
ok      ser1.net/perf   25.732s
```

##### v2.20.2

```
➜ ps go test -benchtime=20s -bench=BenchmarkGoPSUtil .
goos: linux
goarch: amd64
pkg: ser1.net/perf
BenchmarkGoPSUtil-8     1131    21108605 ns/op
--- BENCH: BenchmarkGoPSUtil-8
    perf_test.go:32: GoPsUtil errors: 0
    perf_test.go:32: GoPsUtil errors: 0
    perf_test.go:32: GoPsUtil errors: 13
PASS
ok      ser1.net/perf   26.044s
```
Another hiccup with go-sysinfo: it isn't making syscalls; it's looking directly in `/proc`. This means it's unusable for (e.g.) gotop, because unless it's running as root it can't get information about many of the processes running on the system. `ps` can, so forking it and parsing the results can provide a system overview to non-root users that go-sysinfo can't.

Now I'm wondering how gopsutil gets its info 🤔
@xxxserxxx First of all, thank you for your beautiful work. It has also given me a few ideas. However, do let me know if I'm wrong: when you execute an `exec.Cmd`, it doesn't fork in the C sense; it spawns another process. Unless you have been monitoring the CPU/mem usage of that process as well, simply observing the CPU usage of gotop will yield inaccurate results; the work is handed to the `ps` process while gotop only parses the results, and the parsing is extra work on top.
To my understanding, gopsutil also reads from `/proc`. However, I haven't come across a case where I couldn't get information about processes even though I wasn't running as root. Perhaps I simply didn't notice.

EDIT: I have taken a look into how `ps` works, and it supposedly also reads from `/proc`. I don't understand the difference now :(
You're right: I'm not tracking the `ps` process CPU usage at all. It enters and exits far too quickly, and I'm not able to monitor total average CPU usage at a fine enough granularity to notice whether it's higher or lower overall either way.
Yeah, I don't know about `ps`; unless it's running in some sort of limited elevated-privilege mode, I can't think of why it works. It may be because it's pulling limited, permitted data from `/proc` (just general CPU and memory utilization) whereas the libs are trying to gather more restricted information like the environment. The former is not (necessarily) sensitive; accessing the environment variables of running processes is. But I'm totally guessing.
I do know that some errors occur because processes disappear between the time the library pulls the list of procs and the time it reads data on them. The handles are literally disappearing, as if you read a directory glob and something deleted files before you processed each one. Again, I haven't dug into the internals, but this could be mitigated by recognizing that the handle has disappeared and swallowing the error rather than returning it. I believe go-sysinfo is particularly susceptible to this because the list and stat pulls are entirely separate calls made by the client using the API, rather than a single call that pulls and returns everything.
That said, I'm less worried about the disappearing handles; those can be easily recognized and handled in client code. I'm more concerned about the permission errors I see. Also, I only know the permission issue is happening with go-sysinfo; I haven't looked for or observed that error in gopsutil yet.
> what did you mean, "benchmark before this change?" Do you want me to fix the version at the commit prior to #821 and benchmark that?
Exactly; just out of curiosity, I was not expecting a spectacular difference. Sorry for the noise.
> I believe go-sysinfo is particularly susceptible to this because the list and stat pulls are entirely different calls by the client using the API, rather than a single call that pulls and returns everything.
That's also the case for gopsutil, see my previous comment.
ps, go-sysinfo and gopsutil all get their data about processes from procfs on Linux. There are no special syscalls involved, so they should report the same errors. Maybe gopsutil and go-sysinfo don't use the exact same files to get the same data, i.e. files with different permissions.
About the core issue reported (comparatively high CPU usage relative to ps), see my previous message. Determining how go-sysinfo is able to get this data more efficiently than gopsutil would require some profiling to see what the two libraries do differently; this benchmark provides an interesting data point, thanks for it.
I've been reading back through the thread, and I just realized that it doesn't matter whether we're counting the `ps` process or not. The benchmarks measure the number of information-gathering calls that can be made per run; that's an absolute number. Also, a persistent 2% average CPU use, versus a 0.5% CPU use with a possible once-per-second spike from `ps` (hard to see because it's transient and never lives long enough to show up in process lists), is still a net loss.

I don't think we can explain away the performance difference by the off-load onto a separate child process, which (a) isn't persistent, and (b) is irrelevant to the benchmark.
We are seeing a major performance regression on Linux between 2.20.4 and 2.20.5: in a test case with about 280 processes, CPU use jumps from 14% in 2.20.4 to 195% in 2.20.5. It looks like some cache code was removed, which may be the culprit.
> We are seeing a major performance regression on linux between 2.20.4 and 2.20.5. With CPU jumping in a test case from 14% with about 280 processes in 2.20.4, to 195% in 2.20.5. Looks like there was some cache code removed which may be suspect.
We don't cache BootTime anymore, to fix #837. We now retrieve this value on each new process instantiation, as a consequence of the call to CreateTime() (which calls BootTime()), so that we can later compare it in a future call to IsRunning(). #945 may bring some improvements by getting uptime through the sysinfo syscall.
Here's an update from nearly a year later: I updated to v3 to see if anything changed, and it appears performance is about the same:
```
10642» go test -benchtime=20s -bench .
goos: linux
goarch: amd64
pkg: ser1.net/perf
cpu: Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz
BenchmarkForking-8      1028    23357252 ns/op
--- BENCH: BenchmarkForking-8
    perf_test.go:25: forking errors: 0
    perf_test.go:25: forking errors: 0
    perf_test.go:25: forking errors: 0
BenchmarkGoPSUtil-8       40   556564316 ns/op
--- BENCH: BenchmarkGoPSUtil-8
    perf_test.go:32: GoPsUtil errors: 0
    perf_test.go:32: GoPsUtil errors: 0
BenchmarkGoSysInfo-8     964    25360231 ns/op
--- BENCH: BenchmarkGoSysInfo-8
    perf_test.go:39: GoSysInfo errors: 0
    perf_test.go:39: GoSysInfo errors: 0
    perf_test.go:39: GoSysInfo errors: 7
PASS
ok      ser1.net/perf   76.207s
10643» grep shirou go.mod
        github.com/shirou/gopsutil/v3 v3.21.8
```
A significant performance issue has been observed in gopsutil on Linux when comparing versions 2.20.4 and 2.20.5. In version 2.20.4, a test case with approximately 280 processes exhibited a CPU usage of 14%; in version 2.20.5, CPU usage skyrocketed to 195%.

This degradation can be attributed to the removal of BootTime caching, done to address a scenario where a device initially reports an incorrect date (e.g., the epoch, or a more recent date set by a tool like fake_hwclock) and later corrects it via NTP, which made gopsutil report incorrect system uptime. With the cache removed, the OpenTelemetry Collector, which relies on gopsutil, re-reads BootTime for every process and host metric at high frequency, leading to increased resource usage.

To mitigate this, one potential solution is to reintroduce BootTime caching in the collector's code. This change is expected to greatly improve performance, since hostmetrics calls gopsutil continually in an "endless loop."

You can find further details and discussion in https://github.com/shirou/gopsutil/issues/1283#issuecomment-1086268660. It is suggested that gopsutil may need to support caching for such high-demand use cases, and that the synchronization of NTP and BootTime should occur at defined intervals or be handled differently to address this scenario effectively.
@cforce For now, please use the latest version, which is v3.23.12; v2 is no longer maintained.

Also, #1579 introduces the `host.EnableBootTimeCache()` function to cache boot time. As you commented, caching BootTime can cause problems, so this cache is disabled by default, and it is up to the user to turn it on. This feature will be released at the end of this month.
**Describe the bug**

`Processes()` is significantly more resource intensive than even forking `ps` with `exec.Command` and parsing the results.

**To Reproduce**

Download the attached Go source code (and rename it to `perf.go`, because GitHub is dumb about attachment suffixes) and run it on the command line. The argument is the number of iterations; you may need to start with fewer if it runs too long.

**Expected behavior**

I'd expect the native code to be more resource efficient than spawning a child `ps` process. On my machine, I consistently see between 2x and 4x the CPU use, and much slower results.

**Environment (please complete the following information):**

[`/etc/os-release` and the result of `uname -a`] (trimmed some attributes out)

**Edit**

I noticed this while doing some development work on gotop; running both branches for about an hour, it didn't stray off this: