Closed jicki closed 10 months ago
Could you please look at the /sys/devices/system/cpu directory and the /sys/devices/system/cpu/offline file on the node where node_exporter panics?
This is a bug in the github.com/prometheus/procfs package in v0.10.0:
https://github.com/prometheus/procfs/blob/dd377c72009a3d077169f6f48c4027713ceeff5e/sysfs/system_cpu.go#L173-L186
```go
func filterOfflineCPUs(offlineCpus *[]uint16, cpus *[]string) error {
	for i, cpu := range *cpus {
		cpuName := strings.TrimPrefix(filepath.Base(cpu), "cpu")
		cpuNameUint16, err := strconv.Atoi(cpuName)
		if err != nil {
			return err
		}
		if binSearch(uint16(cpuNameUint16), offlineCpus) {
			*cpus = append((*cpus)[:i], (*cpus)[i+1:]...)
		}
	}
	return nil
}
```
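The failure mode can be reproduced with a minimal sketch of the same pattern (not the upstream code — a simplified stand-in that tracks offline CPUs in a map rather than a sorted slice): deleting from a slice while ranging over it, so the loop index keeps advancing toward the original length.

```go
package main

import "fmt"

// buggyFilter mimics the v0.10.0 pattern: it deletes elements from cpus
// while ranging over it. The range loop caches the slice header and keeps
// iterating toward the ORIGINAL length, so once enough elements have been
// removed, the slice expression cpus[i+1:] runs past the shrunken length
// and panics. The recover converts that panic into a boolean result.
func buggyFilter(cpus []string, offline map[int]bool) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	for i := range cpus {
		if offline[i] {
			cpus = append(cpus[:i], cpus[i+1:]...)
		}
	}
	return false
}

func main() {
	// Three offline CPUs out of four: enough removals to trigger the panic.
	fmt.Println(buggyFilter(
		[]string{"cpu0", "cpu1", "cpu2", "cpu3"},
		map[int]bool{0: true, 1: true, 2: true},
	))
	// true
}
```

With few (or no) offline CPUs the loop happens to finish without panicking, which is why the crash only shows up on machines with many offline cores.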
If there are many offline CPUs, `*cpus = append((*cpus)[:i], (*cpus)[i+1:]...)` shortens the slice on every removal while the range loop keeps advancing toward the original length, so later iterations index out of bounds. This bug has been fixed in later versions of github.com/prometheus/procfs, so you can upgrade node_exporter.
Will there be an update to the prom/node-exporter Docker container to resolve this?
When is a new release expected that is rebuilt with this fix?
I am running into the same issue with release v1.6.1 on a Jetson Orin device (using prometheus-kube-stack). When will the updated version be available?
```
goroutine 73 [running]:
github.com/prometheus/procfs/sysfs.filterOfflineCPUs(0x40002c2e00?, 0x4000105bf0)
	/go/pkg/mod/github.com/prometheus/procfs@v0.10.0/sysfs/system_cpu.go:181 +0x214
github.com/prometheus/procfs/sysfs.FS.SystemCpufreq({{0xfffff3572dd5?, 0x9?}})
	/go/pkg/mod/github.com/prometheus/procfs@v0.10.0/sysfs/system_cpu.go:209 +0x1c8
github.com/prometheus/node_exporter/collector.(*cpuFreqCollector).Update(0x0?, 0x0?)
	/app/collector/cpufreq_linux.go:51 +0x38
github.com/prometheus/node_exporter/collector.execute({0x72858b, 0x7}, {0x83f4a8, 0x40000433a0}, 0x0?, {0x83efc8, 0x40000b7200})
	/app/collector/collector.go:161 +0x60
github.com/prometheus/node_exporter/collector.NodeCollector.Collect.func1({0x72858b?, 0x0?}, {0x83f4a8?, 0x40000433a0?})
	/app/collector/collector.go:152 +0x3c
created by github.com/prometheus/node_exporter/collector.NodeCollector.Collect
	/app/collector/collector.go:151 +0x98
```
As a workaround, we can use the `master` tag image for now.
@uqix, thank you for the suggestion. Because I am using the prometheus-kube-stack Helm chart, using a `master` tag version triggered a validation error (I am not an expert in Helm and don't know how to get around that), so I am using v1.5.0 for the moment.