Closed: piotrp closed this issue 4 months ago
You will note that you are getting ENOENT multiple times per device.
The plugin has two places where that error message can be printed: once before the disk tags are collected and again after. In both cases the exact same function, d.diskInfo(devName), is called.
Because no name templates are defined in your config, the first warning is never printed: we immediately return the device name and links without any further processing. Then, because you defined a device tag, we make the same call, check the error right away, and print the warning. My question is why we aren't checking the error the first time around as well.
I think we should, although that would actually mean more warnings in your logs, since we would then warn twice per device: once when trying to get the disk name and again when getting the disk tags.
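To make the control flow described above easier to follow, here is a minimal Go sketch. It is not the plugin's actual source: the `plugin` type, its fields, the `warnings` slice, and the always-failing `diskInfo` are hypothetical stand-ins, and template expansion is left out. It only illustrates how the first call site can swallow the error when no name templates are configured while the second call site reports it.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the ENOENT seen when /dev/<name> is missing.
var errNotFound = errors.New("no such file or directory")

// plugin is a hypothetical, stripped-down stand-in for the diskio input.
type plugin struct {
	nameTemplates []string
	deviceTags    []string
	warnings      []string // collected instead of logged, so the sketch is easy to check
}

// diskInfo mimics a lookup that always fails, as in an unprivileged container.
func (p *plugin) diskInfo(devName string) (map[string]string, error) {
	return nil, fmt.Errorf("error reading /dev/%s: %w", devName, errNotFound)
}

// diskName is the first call site. With no name templates configured it
// returns early, so the error from diskInfo is never reported.
func (p *plugin) diskName(devName string) string {
	_, err := p.diskInfo(devName)
	if len(p.nameTemplates) == 0 {
		return devName // early return: err is silently dropped here
	}
	if err != nil {
		p.warnings = append(p.warnings,
			fmt.Sprintf("Unable to gather disk name for %q: %v", devName, err))
		return devName
	}
	return devName // template expansion omitted in this sketch
}

// diskTags is the second call site. It checks the error immediately.
func (p *plugin) diskTags(devName string) map[string]string {
	if len(p.deviceTags) == 0 {
		return nil
	}
	info, err := p.diskInfo(devName)
	if err != nil {
		p.warnings = append(p.warnings,
			fmt.Sprintf("Unable to gather disk tags for %q: %v", devName, err))
		return nil
	}
	tags := make(map[string]string)
	for _, key := range p.deviceTags {
		if value, ok := info[key]; ok {
			tags[key] = value
		}
	}
	return tags
}

func main() {
	// Only device tags are configured, mirroring the reported setup.
	p := &plugin{deviceTags: []string{"ID_FS_TYPE"}}
	p.diskName("loop0")
	p.diskTags("loop0")
	for _, w := range p.warnings {
		fmt.Println(w)
	}
}
```

With only device tags configured, the sketch records a single "disk tags" warning per device per call. Adding a name template would surface the "disk name" warning as well.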
I've put up https://github.com/influxdata/telegraf/pull/15667, which prints each message only once and includes some additional details about which file was being read. Artifacts that you can use will get added to the PR in 20-30 minutes by the tiger-bot. Please give those a shot and let me know.
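For readers curious how "print once" behaviour can be achieved, the following Go sketch shows one common approach: remembering which warnings have already been emitted. This is an assumption about the general technique, not a copy of what the PR does. The `onceWarner` type and its key scheme are hypothetical, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// onceWarner remembers which keys have already produced a warning.
// Hypothetical helper for illustration only; not safe for concurrent use.
type onceWarner struct {
	seen map[string]bool
	out  []string // emitted messages, kept in order
}

func newOnceWarner() *onceWarner {
	return &onceWarner{seen: make(map[string]bool)}
}

// warn emits msg only the first time key is seen and reports whether it did.
func (w *onceWarner) warn(key, msg string) bool {
	if w.seen[key] {
		return false
	}
	w.seen[key] = true
	w.out = append(w.out, msg)
	return true
}

func main() {
	w := newOnceWarner()
	// Simulate three collection intervals over two devices.
	for interval := 0; interval < 3; interval++ {
		for _, dev := range []string{"loop0", "sda"} {
			w.warn("name:"+dev, fmt.Sprintf(
				"Unable to gather disk name for %q: error reading /dev/%s: no such file or directory", dev, dev))
			w.warn("tags:"+dev, fmt.Sprintf(
				"Unable to gather disk tags for %q: error reading /dev/%s: no such file or directory", dev, dev))
		}
	}
	for _, msg := range w.out {
		fmt.Println(msg)
	}
}
```

Keying on both the kind of lookup and the device name keeps the "disk name" and "disk tags" warnings distinct while suppressing repeats on later intervals.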
Works great: I get unique warnings for each device, with helpful details, and they aren't repeated on each metric collection.
2024-07-25T15:31:45Z I! Starting Telegraf 1.32.0-5ab72b53 brought to you by InfluxData the makers of InfluxDB
2024-07-25T15:31:45Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 62 outputs, 6 secret-stores
2024-07-25T15:31:45Z I! Loaded inputs: cpu disk diskio elasticsearch internal jolokia2_agent (2x) kernel linux_sysctl_fs logstash mem net nfsclient nstat processes swap system
2024-07-25T15:31:45Z I! Loaded aggregators:
2024-07-25T15:31:45Z I! Loaded processors: starlark (2x) strings
2024-07-25T15:31:45Z I! Loaded secretstores:
2024-07-25T15:31:45Z I! Loaded outputs: influxdb
2024-07-25T15:31:45Z I! Tags enabled: env=tools host=tools
2024-07-25T15:31:45Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"tools", Flush Interval:10s
2024-07-25T15:31:50Z W! [inputs.diskio] Unable to gather disk name for "loop0": error reading /dev/loop0: no such file or directory
2024-07-25T15:31:50Z W! [inputs.diskio] Unable to gather disk tags for "loop0": error reading /dev/loop0: no such file or directory
2024-07-25T15:31:50Z W! [inputs.diskio] Unable to gather disk name for "sda": error reading /dev/sda: no such file or directory
2024-07-25T15:31:50Z W! [inputs.diskio] Unable to gather disk tags for "sda": error reading /dev/sda: no such file or directory
2024-07-25T15:31:50Z W! [inputs.diskio] Unable to gather disk name for "sda1": error reading /dev/sda1: no such file or directory
2024-07-25T15:31:50Z W! [inputs.diskio] Unable to gather disk tags for "sda1": error reading /dev/sda1: no such file or directory
2024-07-25T15:31:50Z W! [inputs.diskio] Unable to gather disk name for "sda2": error reading /dev/sda2: no such file or directory
2024-07-25T15:31:50Z W! [inputs.diskio] Unable to gather disk tags for "sda2": error reading /dev/sda2: no such file or directory
Relevant telegraf.conf
Logs from Telegraf
System info
Telegraf 1.31.2, in container.
Docker
No response
Steps to reproduce
Use the provided config in an unprivileged Docker container (or a container created with machinectl/systemd-nspawn).
device_tags must be present in the configuration.
Expected behavior
No error logged or a better error message that actually indicates what is missing.
I tried to deploy the same file to a real environment and a test container, and noticed that the logs in my container contain multiple warnings. Considering that device_tags works on a best-effort basis, this warning probably shouldn't be logged.
Actual behavior
Multiple warnings logged on each collection interval:
Additional info
Traces for loop0 device:
All ENOENT messages reported by OS: