Open sdalu opened 1 year ago
Hi,
As I mentioned in the previous issue we do not provide an arm64 build for FreeBSD, so to work on this I would need you to build and test PRs.
The procstat metric is generated in the addMetric function here.
procstat_lookup,host=rork.home.sdalu.com,org_destination=IT,result=success pid_count=1i,result_code=0i,running=1i 1694812868000000000
Is this the actual output of procstat_lookup? Before diving much further into this I want to be certain that running is actually non-zero. If it is zero, or if there were any errors updating the processes, then no procstat metric will be generated.
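To check the running field without reading the line by eye, something like the following could be used. This is only a minimal sketch: fieldInt is a hypothetical helper (not part of Telegraf), and it assumes the line contains no escaped spaces or commas, which holds for the sample above but not for line protocol in general.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fieldInt extracts an integer field such as "running=1i" from a single
// line-protocol line. Hypothetical helper: it assumes no escaped spaces
// or commas in the measurement, tags, or fields.
func fieldInt(line, name string) (int64, error) {
	// Layout assumed: "<measurement,tags> <fields> <timestamp>".
	parts := strings.Split(strings.TrimSpace(line), " ")
	if len(parts) < 2 {
		return 0, fmt.Errorf("no field set in %q", line)
	}
	for _, kv := range strings.Split(parts[1], ",") {
		key, value, ok := strings.Cut(kv, "=")
		if !ok || key != name {
			continue
		}
		// Integer fields carry a trailing "i" in line protocol.
		return strconv.ParseInt(strings.TrimSuffix(value, "i"), 10, 64)
	}
	return 0, fmt.Errorf("field %q not found", name)
}

func main() {
	line := "procstat_lookup,host=rork.home.sdalu.com,org_destination=IT,result=success pid_count=1i,result_code=0i,running=1i 1694812868000000000"
	running, err := fieldInt(line, "running")
	if err != nil {
		panic(err)
	}
	fmt.Println("running:", running)
}
```

If this reports a non-zero value, the lookup itself found the process and the problem is further down, in gathering the per-process data.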
Finally, the data is all gathered via gopsutil's library, so I think we should try outside of telegraf as well.
Create a directory containing the following two files:
main.go:
package main

import (
	"fmt"
	"os"

	"github.com/shirou/gopsutil/process"
)

func main() {
	// Look up this program's own process through gopsutil.
	currentPid := os.Getpid()
	myself, err := process.NewProcess(int32(currentPid))
	if err != nil {
		panic(err)
	}

	// Each call returns a value and an error; both are printed as-is.
	fmt.Println(myself.Name())
	fmt.Println(myself.String())
	fmt.Println(myself.NumThreads())
	fmt.Println(myself.RlimitUsage(true))
	fmt.Println(myself.Status())
}
go.mod - replace the go version with whatever you have locally:
module test-process
go 1.21
Run go mod tidy first so the gopsutil dependency is added to go.mod. Then either run this directly via go run . or build it with go build . and run the test-process binary.
Hello! I am closing this issue due to inactivity. I hope you were able to resolve your problem. If not, please try posting this question in our Community Slack or Community Forums, or provide additional details in this issue and request that it be re-opened. Thank you!
Sorry for the late answer
procstat_lookup,host=rork.home.sdalu.com,org_destination=IT,result=success pid_count=1i,result_code=0i,running=1i 1694812868000000000
Is this the actual output of procstat_lookup? Before diving much further into this I want to be certain that running is actually non-zero. If it is zero, or if there were any errors updating the processes, then no procstat metric will be generated.
Yes that's actual output
Finally, the data is all gathered via gopsutil's library, so I think we should try outside of telegraf as well. [...] And either run this directly via go run . or build it go build . and run the test-process binary.
Output is:
<nil>
{"pid":91329}
0 <nil>
[] not implemented yet
<nil>
gopsutil is returning an empty name along with default values for the other metrics, which means we skip the process. Here is the code in Telegraf, which checks for the missing name and comments that if it is missing we assume we are not getting anything else. Based on the output, the other calls also seem to return default values or nil.
I would suggest filing an upstream issue with the gopsutil project to get this added or enabled there. You can use the example code I provided in my previous comment as a way to reproduce it.
@sdalu,
I have put up https://github.com/influxdata/telegraf/pull/15272, which includes an update to the gopsutil library. Your upstream issue appears to have been fixed back in March, so it is likely that our last release already has this fix. Could you please download the artifacts from that PR, which will be attached as a comment about 30 minutes after this message, and let me know if this resolves the issue?
Thanks!
I downloaded telegraf-1.31.0~553d972c_freebsd_armv7.tar.gz
and ran
./telegraf-1.31.0/usr/bin/telegraf --config /usr/local/etc/telegraf.conf --debug
Got a panic
2024-05-02T13:09:40Z E! FATAL: [inputs.procstat] panicked: runtime error: invalid memory address or nil pointer dereference, Stack:
goroutine 147 [running]:
github.com/influxdata/telegraf/agent.panicRecover(0x4d410370)
/go/src/github.com/influxdata/telegraf/agent/agent.go:1202 +0x74
panic({0x67aa400, 0xc587b20})
/usr/local/go/src/runtime/panic.go:770 +0xfc
github.com/shirou/gopsutil/v3/process.(*Process).createTimeWithContext(0x4d0a0368, {0x8232a44, 0xc9983c0})
/go/pkg/mod/github.com/shirou/gopsutil/v3@v3.24.4/process/process_freebsd.go:121 +0x4c
github.com/shirou/gopsutil/v3/process.(*Process).CreateTimeWithContext(0x4d0a0368, {0x8232a44, 0xc9983c0})
/go/pkg/mod/github.com/shirou/gopsutil/v3@v3.24.4/process/process.go:310 +0x74
github.com/shirou/gopsutil/v3/process.NewProcessWithContext({0x8232a44, 0xc9983c0}, 0x3744)
/go/pkg/mod/github.com/shirou/gopsutil/v3@v3.24.4/process/process.go:218 +0x78
github.com/shirou/gopsutil/v3/process.NewProcess(...)
/go/pkg/mod/github.com/shirou/gopsutil/v3@v3.24.4/process/process.go:203
github.com/influxdata/telegraf/plugins/inputs/procstat.newProc(0x3744)
/go/src/github.com/influxdata/telegraf/plugins/inputs/procstat/process.go:38 +0x30
github.com/influxdata/telegraf/plugins/inputs/procstat.(*Procstat).gatherOld(0x4ccc6e48, {0x824a858, 0x4d40cae0})
/go/src/github.com/influxdata/telegraf/plugins/inputs/procstat/procstat.go:209 +0x848
github.com/influxdata/telegraf/plugins/inputs/procstat.(*Procstat).Gather(0x4ccc6e48, {0x824a858, 0x4d40cae0})
/go/src/github.com/influxdata/telegraf/plugins/inputs/procstat/procstat.go:166 +0x38
github.com/influxdata/telegraf/models.(*RunningInput).Gather(0x4d410370, {0x824a858, 0x4d40cae0})
/go/src/github.com/influxdata/telegraf/models/running_input.go:227 +0x2c4
github.com/influxdata/telegraf/agent.(*Agent).gatherOnce.func1()
/go/src/github.com/influxdata/telegraf/agent/agent.go:583 +0x70
created by github.com/influxdata/telegraf/agent.(*Agent).gatherOnce in goroutine 120
/go/src/github.com/influxdata/telegraf/agent/agent.go:581 +0xc0
goroutine 1 [semacquire]:
sync.runtime_Semacquire(0x4d310b68)
/usr/local/go/src/runtime/sema.go:62 +0x3c
sync.(*WaitGroup).Wait(0
2024-05-02T13:09:40Z E! PLEASE REPORT THIS PANIC ON GITHUB with stack trace, configuration, and OS information: https://github.com/influxdata/telegraf/issues/new/choose
Well that's no good! Can you file a second upstream issue please with that stack trace. It does appear that gopsutil's createTimeWithContext function is the cause of the crash.
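While the upstream fix is pending, one way to narrow this down outside of Telegraf is to wrap the failing call so the panic becomes an ordinary error. The sketch below is stdlib-only and purely illustrative: safeCall, info, and createTime are hypothetical names, and createTime merely simulates a nil pointer dereference rather than reproducing the gopsutil code.

```go
package main

import "fmt"

// info stands in for whatever structure the real call dereferences.
type info struct{ start int64 }

// createTime simulates the failure mode: it panics when given a nil pointer.
func createTime(i *info) int64 { return i.start }

// safeCall runs fn and converts any panic into a returned error, so one
// bad process lookup does not take down the whole program.
func safeCall(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	return fn()
}

func main() {
	err := safeCall(func() error {
		var missing *info // nil, as if the lookup returned nothing
		fmt.Println(createTime(missing))
		return nil
	})
	fmt.Println("error:", err)
}
```

This only contains the crash; the nil check itself still belongs in the library that dereferences the pointer.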
Relevant telegraf.conf
Logs from Telegraf
System info
Telegraf 1.28.0 FreeBSD 13.2 arm64
Docker
No response
Steps to reproduce
Expected behavior
Some procstat and procstat_lookup metrics, like:
Actual behavior
Only procstat_lookup is generated, no procstat
Additional info
Don't know if behaviour is specific to arm64 or the whole arm family. Tested on amd64, and it's working fine, so it is not specific to FreeBSD