Open sdalu opened 2 months ago
I can pass go122 test -run Test_Process_CreateTime
on my QEMU environment. Any other information or could you provide a minimal reproducible code?
# go122 version
go version go1.22.1 freebsd/arm64
# uname -a
FreeBSD generic 14.0-RELEASE FreeBSD 14.0-RELEASE #0 releng/14.0-n265380-f9716eee8ab4: Fri Nov 10 08:59:18 UTC 2023 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64
I'm just testing with the artifact freebsd_armv7.tar.gz provided https://github.com/influxdata/telegraf/pull/15272 . Hoping that @powersj could tell you more about the build. The computer on which I tested is a Raspberry PI 4.
I'm not a go user, how do I run go test -run Test_Process_CreateTime
?
I tried:
git clone https://github.com/shirou/gopsutil
cd gopsutil
go test -run Test_Process_CreateTime
# result I got:
? github.com/shirou/gopsutil/v3 [no test files]
@sdalu,
Can you run go test -v -run Test_Process_CreateTime ./process/...
@shirou,
In the original issue, I had the reporter use the following code. You can see his response for the output.
root@brain:~/gopsutil # go test -v -run Test_Process_CreateTime ./process/...
go: downloading golang.org/x/sys v0.20.0
go: downloading github.com/stretchr/testify v1.9.0
=== RUN Test_Process_CreateTime
--- PASS: Test_Process_CreateTime (0.00s)
PASS
ok github.com/shirou/gopsutil/v3/process 0.016s
Thank you for the information. This means that, at least in the test environment, the CreateTime()
is being retrieved without any issues. In other words, it is not a simple implementation mistake in the code or a problem with the struct definition. Do you have any other potential issues in mind?
@shirou,
Well gopsutil should not panic :)
In the original issue, I had sdalu, first get a process using the scripts own PID:
myself, err := process.NewProcess(int32(currentPid))
This returned nothing, yet the test does.
@sdalu,
Can you run the following go code:
go.mod:
module test-process
go 1.22
require github.com/shirou/gopsutil/v3 v3.24.4
require (
github.com/go-ole/go-ole v1.2.6 // indirect
github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect
github.com/power-devops/perfstat v0.0.0-20210106213030-5aafc221ea8c // indirect
github.com/shoenig/go-m1cpu v0.1.6 // indirect
github.com/tklauser/go-sysconf v0.3.14 // indirect
github.com/tklauser/numcpus v0.8.0 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
golang.org/x/sys v0.20.0 // indirect
)
main.go:
package main
import (
"fmt"
"os"
"github.com/shirou/gopsutil/v3/process"
)
func main() {
currentPid := os.Getpid()
myself, err := process.NewProcess(int32(currentPid))
if err != nil {
panic(err)
}
fmt.Println(myself.Name())
fmt.Println(myself.String())
fmt.Println(myself.NumThreads())
fmt.Println(myself.RlimitUsage(true))
fmt.Println(myself.Status())
time, err := myself.CreateTime()
if err != nil {
panic(err)
}
fmt.Println(time)
}
And either run this directly via go run . or build it go build . and run the test-process binary.
I'm using go 1.21
root@brain:~/t # go run .
test-process <nil>
{"pid":38018}
6 <nil>
[] not implemented yet
[running] <nil>
1715697644691
Well that is not what I was expecting. Just to confirm, this was run on your FreeBSD system that was showing the issue with Telegraf?
Yes. I re-run the telegraf artifact, just in case. It's crashing with the same panic. Should I try to build it instead of runing the artifact ?
@sdalu I have put up https://github.com/influxdata/telegraf/pull/15355 which will hopefully print out what the result of getKProc
call and the pid we are looking at. In 20-30mins can you download another Telegraf artifact and see what you get?
Thanks!
@powersj Here is the output:
root@brain:~/t/telegraf-1.31.0 # ./usr/bin/telegraf --config /usr/local/etc/telegraf.conf --debug
2024-05-14T20:07:16Z I! Loading config: /usr/local/etc/telegraf.conf
2024-05-14T20:07:16Z W! DeprecationWarning: Option "fielddrop" of plugin "inputs.system" deprecated since version 1.29.0 and will be removed in 2.0.0: use 'fieldexclude' instead
2024-05-14T20:07:16Z W! DeprecationWarning: Option "dns_lookup" of plugin "inputs.ntpq" deprecated since version 1.24.0 and will be removed in 2.0.0: add '-n' to 'options' instead to skip DNS lookup
2024-05-14T20:07:16Z I! Starting Telegraf 1.31.0-74404b62 brought to you by InfluxData the makers of InfluxDB
2024-05-14T20:07:16Z I! Available plugins: 233 inputs, 9 aggregators, 32 processors, 25 parsers, 60 outputs, 4 secret-stores
2024-05-14T20:07:16Z I! Loaded inputs: apache cpu disk diskio exec (2x) internal mem mysql net netstat ntpq processes procstat (10x) redis smart swap system zfs
2024-05-14T20:07:16Z I! Loaded aggregators:
2024-05-14T20:07:16Z I! Loaded processors: converter override (4x) regex
2024-05-14T20:07:16Z I! Loaded secretstores:
2024-05-14T20:07:16Z I! Loaded outputs: influxdb_v2
2024-05-14T20:07:16Z I! Tags enabled: host=brain.home.sdalu.com
2024-05-14T20:07:16Z W! Deprecated inputs: 0 and 1 options
2024-05-14T20:07:16Z I! [agent] Config: Interval:20s, Quiet:false, Hostname:"brain.home.sdalu.com", Flush Interval:30s
2024-05-14T20:07:16Z D! [agent] Initializing plugins
2024-05-14T20:07:16Z W! [inputs.smart] nvme not found: verify that nvme is installed and it is in your PATH (or specified in config) to gather vendor specific attributes: provided path does not exist: []
2024-05-14T20:07:16Z W! DeprecationWarning: Value "false" for option "ignore_protocol_stats" of plugin "inputs.net" deprecated since version 1.27.3 and will be removed in 1.36.0: use the 'inputs.nstat' plugin instead
2024-05-14T20:07:16Z D! [agent] Connecting outputs
2024-05-14T20:07:16Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2024-05-14T20:07:16Z D! [agent] Successfully connected to outputs.influxdb_v2
2024-05-14T20:07:16Z D! [agent] Starting service inputs
getting process with pid: 40544
k: (*process.KinfoProc)(nil)
2024-05-14T20:07:20Z E! FATAL: [inputs.procstat] panicked: runtime error: invalid memory address or nil pointer dereference, Stack:
goroutine 149 [running]:
github.com/influxdata/telegraf/agent.panicRecover(0x4d638140)
/go/src/github.com/influxdata/telegraf/agent/agent.go:1202 +0x74
panic({0x696a390, 0xc88a150})
/usr/local/go/src/runtime/panic.go:770 +0xfc
github.com/shirou/gopsutil/v3/process.(*Process).createTimeWithContext(0x4ce30098, {0x843a4c4, 0xcc9ccd8})
/go/pkg/mod/github.com/powersj/gopsutil/v3@v3.0.0-20240514151642-919283ff5f48/process/process_freebsd.go:123 +0xac
github.com/shirou/gopsutil/v3/process.(*Process).CreateTimeWithContext(0x4ce30098, {0x843a4c4, 0xcc9ccd8})
/go/pkg/mod/github.com/powersj/gopsutil/v3@v3.0.0-20240514151642-919283ff5f48/process/process.go:312 +0x74
github.com/shirou/gopsutil/v3/process.NewProcessWithContext({0x843a4c4, 0xcc9ccd8}, 0x9e60)
/go/pkg/mod/github.com/powersj/gopsutil/v3@v3.0.0-20240514151642-919283ff5f48/process/process.go:220 +0x78
github.com/shirou/gopsutil/v3/process.NewProcess(0x9e60)
/go/pkg/mod/github.com/powersj/gopsutil/v3@v3.0.0-20240514151642-919283ff5f48/process/process.go:205 +0x90
github.com/influxdata/telegraf/plugins/inputs/procstat.newProc(0x9e60)
/go/src/github.com/influxdata/telegraf/plugins/inputs/procstat/process.go:38 +0x1c
github.com/influxdata/telegraf/plugins/inputs/procstat.(*Procstat).gatherOld(0x4d36cf78, {0x8452d48, 0x4d367308})
/go/src/github.com/influxdata/telegraf/plugins/inputs/procstat/procstat.go:223 +0x810
github.com/influxdata/telegraf/plugins/inputs/procstat.(*Procstat).Gather(0x4d36cf78, {0x8452d48, 0x4d367308})
/go/src/github.com/influxdata/telegraf/plugins/inputs/procstat/procstat.go:185 +0x38
github.com/influxdata/telegraf/models.(*RunningInput).Gather(0x4d638140, {0x8452d48, 0x4d367308})
/go/src/github.com/influxdata/telegraf/models/running_input.go:228 +0x2c4
github.com/influxdata/telegraf/agent.(*Agent).gatherOnce.func1()
/go/src/github.com/influxdata/telegraf/agent/agent.go:583 +0x70
created by github.com/influxdata/telegraf/agent.(*Agent).gatherOnce in goroutine 146
/go/src/github.com/influxdata/telegraf/agent/agent.go:581 +0xc0
goroutine
2024-05-14T20:07:20Z E! PLEASE REPORT THIS PANIC ON GITHUB with stack trace, configuration, and OS information: https://github.com/influxdata/telegraf/issues/new/choose
Thanks for trying that out!
getting process with pid: 40544 k: (*process.KinfoProc)(nil)
@shirou any reason why the kinfoproc would be nil? Does that mean gopsutil failed to parse the information but didn't generate an error?
That print statement is from my own fork, printing out the k value before we attempt to create time with context:
I read the code and it doesn't seem like err
is nil and kInfoProc
is nil at the same time. However, this issue is occurring in practice. Therefore, I have created a PR #1653 to add a nil check. Could you please take a look?
@sdalu,
As before, can you grab the telegraf artifacts in few minutes from this post at: https://github.com/influxdata/telegraf/pull/15397
@shirou - thanks for looking into this!
@powersj here is the result (looks the same):
root@brain:~/telegraf-1.31.0 # ./usr/bin/telegraf --config /usr/local/etc/telegraf.conf --debug
2024-05-23T19:41:13Z I! Loading config: /usr/local/etc/telegraf.conf
2024-05-23T19:41:13Z W! DeprecationWarning: Option "fielddrop" of plugin "inputs.system" deprecated since version 1.29.0 and will be removed in 2.0.0: use 'fieldexclude' instead
2024-05-23T19:41:13Z W! DeprecationWarning: Option "dns_lookup" of plugin "inputs.ntpq" deprecated since version 1.24.0 and will be removed in 2.0.0: add '-n' to 'options' instead to skip DNS lookup
2024-05-23T19:41:13Z I! Starting Telegraf 1.31.0-ef79d2b9 brought to you by InfluxData the makers of InfluxDB
2024-05-23T19:41:13Z I! Available plugins: 233 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 4 secret-stores
2024-05-23T19:41:13Z I! Loaded inputs: apache cpu disk diskio exec (2x) internal mem mysql net netstat ntpq processes procstat (10x) redis smart swap system zfs
2024-05-23T19:41:13Z I! Loaded aggregators:
2024-05-23T19:41:13Z I! Loaded processors: converter override (4x) regex
2024-05-23T19:41:13Z I! Loaded secretstores:
2024-05-23T19:41:13Z I! Loaded outputs: influxdb_v2
2024-05-23T19:41:13Z I! Tags enabled: host=brain.home.sdalu.com
2024-05-23T19:41:13Z W! Deprecated inputs: 0 and 1 options
2024-05-23T19:41:13Z I! [agent] Config: Interval:20s, Quiet:false, Hostname:"brain.home.sdalu.com", Flush Interval:30s
2024-05-23T19:41:13Z D! [agent] Initializing plugins
2024-05-23T19:41:13Z W! [inputs.smart] nvme not found: verify that nvme is installed and it is in your PATH (or specified in config) to gather vendor specific attributes: provided path does not exist: []
2024-05-23T19:41:13Z W! DeprecationWarning: Value "false" for option "ignore_protocol_stats" of plugin "inputs.net" deprecated since version 1.27.3 and will be removed in 1.36.0: use the 'inputs.nstat' plugin instead for protocol stats
2024-05-23T19:41:13Z D! [agent] Connecting outputs
2024-05-23T19:41:13Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2024-05-23T19:41:13Z D! [agent] Successfully connected to outputs.influxdb_v2
2024-05-23T19:41:13Z D! [agent] Starting service inputs
2024-05-23T19:41:20Z E! FATAL: [inputs.procstat] panicked: runtime error: invalid memory address or nil pointer dereference, Stack:
goroutine 61 [running]:
github.com/influxdata/telegraf/agent.panicRecover(0x4d36c5a0)
/go/src/github.com/influxdata/telegraf/agent/agent.go:1202 +0x74
panic({0x68b6d10, 0xc708b80})
/usr/local/go/src/runtime/panic.go:770 +0xfc
github.com/shirou/gopsutil/v3/process.(*Process).createTimeWithContext(0x4d0cc128, {0x8361124, 0xcb1af28})
/go/pkg/mod/github.com/shirou/gopsutil/v3@v3.24.5-0.20240523132159-edb58e3dd10d/process/process_freebsd.go:122 +0x4c
github.com/shirou/gopsutil/v3/process.(*Process).CreateTimeWithContext(0x4d0cc128, {0x8361124, 0xcb1af28})
/go/pkg/mod/github.com/shirou/gopsutil/v3@v3.24.5-0.20240523132159-edb58e3dd10d/process/process.go:310 +0x74
github.com/shirou/gopsutil/v3/process.NewProcessWithContext({0x8361124, 0xcb1af28}, 0x624)
/go/pkg/mod/github.com/shirou/gopsutil/v3@v3.24.5-0.20240523132159-edb58e3dd10d/process/process.go:218 +0x78
github.com/shirou/gopsutil/v3/process.NewProcess(...)
/go/pkg/mod/github.com/shirou/gopsutil/v3@v3.24.5-0.20240523132159-edb58e3dd10d/process/process.go:203
github.com/influxdata/telegraf/plugins/inputs/procstat.newProc(0x624)
/go/src/github.com/influxdata/telegraf/plugins/inputs/procstat/process.go:38 +0x30
github.com/influxdata/telegraf/plugins/inputs/procstat.(*Procstat).gatherOld(0x4d29e8f8, {0x8379320, 0x4d3951e8})
/go/src/github.com/influxdata/telegraf/plugins/inputs/procstat/procstat.go:223 +0x810
github.com/influxdata/telegraf/plugins/inputs/procstat.(*Procstat).Gather(0x4d29e8f8, {0x8379320, 0x4d3951e8})
/go/src/github.com/influxdata/telegraf/plugins/inputs/procstat/procstat.go:185 +0x38
github.com/influxdata/telegraf/models.(*RunningInput).Gather(0x4d36c5a0, {0x8379320, 0x4d3951e8})
/go/src/github.com/influxdata/telegraf/models/running_input.go:228 +0x2c4
github.com/influxdata/telegraf/agent.(*Agent).gatherOnce.func1()
/go/src/github.com/influxdata/telegraf/agent/agent.go:583 +0x70
created by github.com/influxdata/telegraf/agent.(*Agent).gatherOnce in goroutine 112
/go/src/github.com/influxdata/telegraf/agent/agent.go:581 +0xc0
goroutine 1 [
2024-05-23T19:41:20Z E! PLEASE REPORT THIS PANIC ON GITHUB with stack trace, configuration, and OS information: https://github.com/influxdata/telegraf/issues/new/choose
I downloaded: telegraf-1.31.0~ef79d2b9_freebsd_armv7.tar.gz
The parseKinfoProc function returns a struct instance, not a pointer. Therefore, I agree that it should be impossible for a pointer to this value to be nil. However, it is actually happening, so I am very confused.
@shirou - can that be changed? Otherwise you would need to check if any of the pointer fields are nil?
Describe the bug When running telegraph 1.31.0~553d972c, with gopsutil v3.24.4, it crash in the createTimeWithContext.
Issue initially reported: https://github.com/influxdata/telegraf/issues/13933
Environment (please complete the following information): FreBSD:
14.0-RELEASE-p4 14.0-RELEASE-p4 14.0-RELEASE-p4
FreeBSD brain.home.sdalu.com 14.0-RELEASE-p4 FreeBSD 14.0-RELEASE-p4 #0 releng/14.0-n265400-4edf3b80733e: Wed Jan 17 19:32:19 CET 2024 root@brain.home.sdalu.com:/usr/obj/usr/src/arm64.aarch64/sys/BROADCOM arm64