@courtarro can you try with Telegraf 1.17.0?
@p-zak I just upgraded via the InfluxDB PPA to 1.17.0 and the result is the same, unfortunately.
For comparison, here's what it looks like when I test from the command line:
root@prismo:/usr/local/sbin# sudo -u telegraf -s
telegraf@prismo:/usr/local/sbin$ whoami
telegraf
telegraf@prismo:/usr/local/sbin$ id
uid=999(telegraf) gid=998(telegraf) groups=998(telegraf)
telegraf@prismo:/usr/local/sbin$ ./smartctl --all /dev/sda
smartctl 7.0 2018-12-30 r5164 [x86_64-linux-4.15.0-129-generic] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
Smartctl open device: /dev/sda failed: Permission denied
telegraf@prismo:/usr/local/sbin$ sudo ./smartctl --all /dev/sda
(WORKS)
I have tried to reproduce this behaviour and here is what I got:
System info: Ubuntu 18.04 (bionic), bare-metal
Telegraf version: 1.17.0
sudo adduser telegraf_test
sudo visudo
Cmnd_Alias SMARTCTL = /usr/sbin/smartctl
telegraf_test ALL=(ALL) NOPASSWD: SMARTCTL
Defaults!SMARTCTL !logfile, !syslog, !pam_session
sudo su telegraf_test
id
uid=1003(telegraf_test) gid=1003(telegraf_test) groups=1003(telegraf_test)
telegraf_test@XXX:/home/XXX/telegraf$ /usr/sbin/smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
telegraf_test@XXX:/home/XXX/telegraf$ sudo /usr/sbin/smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
[[inputs.smart]]
path = "/usr/sbin/smartctl"
use_sudo = true
telegraf_test@XXX:/home/XXX/telegraf$ ./telegraf --config=telegraf.conf --test
2021-01-18T11:57:18Z I! Starting Telegraf
2021-01-18T11:57:18Z D! [agent] Initializing plugins
2021-01-18T11:57:18Z D! [agent] Starting service inputs
boot_time=1607077935i,context_switches=8447125900i,entropy_avail=3027i,interrupts=2862299797i,processes_forked=1921046i 1610971038000000000
2021-01-18T11:57:18Z D! [agent] Stopping service inputs
2021-01-18T11:57:18Z D! [agent] Input channel closed
2021-01-18T11:57:18Z D! [agent] Stopped Successfully
> smart_device,capacity=512110190592,device=sda,enabled=Enabled,host=XXX,model=XXX,serial_no=XXX,wwn=XXX exit_status=0i,health_ok=true,temp_c=22i,udma_crc_errors=0i 1610971038000000000
telegraf_test@XXX:/home/XXX/telegraf$ /usr/sbin/smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
telegraf_test@XXX:/home/XXX/telegraf$ sudo /usr/sbin/smartctl --scan
[sudo] password for telegraf_test:
telegraf_test is not in the sudoers file. This incident will be reported.
telegraf_test@XXX:/home/XXX/telegraf$ ./telegraf --config=telegraf.conf --test
2021-01-18T13:45:37Z I! Starting Telegraf
2021-01-18T13:45:37Z D! [agent] Initializing plugins
2021-01-18T13:45:37Z D! [agent] Starting service inputs
boot_time=1607077935i,context_switches=8454885756i,entropy_avail=2214i,interrupts=2865442312i,processes_forked=1924482i 1610977537000000000
2021-01-18T13:45:37Z E! [inputs.smart] Error in plugin: failed to run command '/usr/sbin/smartctl [--scan]': exit status 1 - sudo: a password is required
2021-01-18T13:45:37Z D! [agent] Stopping service inputs
2021-01-18T13:45:37Z D! [agent] Input channel closed
2021-01-18T13:45:37Z D! [agent] Stopped Successfully
2021-01-18T13:45:37Z E! [telegraf] Error running agent: input plugins recorded 1 errors
As you can see, I have no such permission problem. Could you follow the same steps and post your inputs and outputs?
Also, let me know if you have any entries in the sudoers file for the telegraf group.
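The sudo failures quoted in this thread can be told apart by their stderr text. Below is a minimal Python sketch of such a check, assuming the messages look like the ones pasted above. The helper names (explain_sudo_error, probe) and the hint wording are hypothetical and are not part of Telegraf or sudo.

```python
import subprocess

# Substrings taken from the sudo messages quoted in this thread, paired
# with a hedged guess at the cause. The hint texts are illustrative only.
HINTS = [
    ("a password is required",
     "no NOPASSWD rule matched for this user and command"),
    ("is not in the sudoers file",
     "the user has no sudoers entry at all"),
    ("unable to change to root gid",
     "sudo could not switch to root; the service sandbox may be restricting it"),
]


def explain_sudo_error(stderr: str) -> str:
    """Map a sudo error message to a likely cause (best effort)."""
    for needle, hint in HINTS:
        if needle in stderr:
            return hint
    return "unrecognised sudo error: " + stderr.strip()


def probe(smartctl_path: str) -> str:
    """Run 'sudo -n <smartctl_path> --scan' and summarise the result.

    -n makes sudo fail instead of prompting for a password, which is
    closer to what a service without a terminal would experience.
    """
    result = subprocess.run(
        ["sudo", "-n", smartctl_path, "--scan"],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return "ok"
    return explain_sudo_error(result.stderr)
```

Calling probe("/usr/sbin/smartctl") from an interactive shell only tests the sudoers rules. To see what the service itself sees, the same call would have to run under the same systemd unit settings as Telegraf.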
@courtarro Did you have time to check this?
@p-zak Okay, I finally nailed this down. It's an esoteric issue related to a change made per the ping input readme. I'm also using that input plugin, and to do so with the "native" option, I added the suggested lines to a systemd override file:
[Service]
CapabilityBoundingSet=CAP_NET_RAW
AmbientCapabilities=CAP_NET_RAW
If I understand correctly, this suggestion is not ideal. Because CapabilityBoundingSet is being set, the permissions obtainable via sudo are more restricted than they would otherwise be, and this prevents Telegraf from using sudo to run the smartctl command.
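To make the effect of the bounding set concrete, here is a small Python sketch that decodes the CapBnd line of /proc/<pid>/status and checks whether CAP_SETUID and CAP_SETGID are still available. It assumes a Linux system and uses the capability numbers from <linux/capability.h>. The function names are made up for this example, and the idea that sudo needs both capabilities to become root is my reading of the "unable to change to root gid" error, not something confirmed in the Telegraf docs.

```python
# Capability bit numbers as defined in <linux/capability.h>.
CAP_SETGID = 6
CAP_SETUID = 7
CAP_NET_RAW = 13


def parse_cap_bnd(status_text: str) -> int:
    """Return the CapBnd bitmask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapBnd:"):
            # The value is a hexadecimal bitmask without a 0x prefix.
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapBnd line found")


def has_cap(mask: int, cap: int) -> bool:
    """True if capability number 'cap' is set in 'mask'."""
    return bool(mask & (1 << cap))


def sudo_can_switch_ids(mask: int) -> bool:
    """Assumption: sudo needs CAP_SETUID and CAP_SETGID to become root."""
    return has_cap(mask, CAP_SETUID) and has_cap(mask, CAP_SETGID)


def read_status(pid: str = "self") -> str:
    """Read /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return f.read()
```

With CapabilityBoundingSet=CAP_NET_RAW the mask would be 0000000000002000 (only bit 13 set), so sudo_can_switch_ids returns False. Passing the Telegraf PID to read_status and feeding the result to parse_cap_bnd shows the value for the running service.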
Instead, I changed the override file for ping to:
[Service]
AmbientCapabilities=CAP_NET_RAW
This enables the ping input plugin to do its job, while not limiting the sudo command used by the smart plugin. So now everything works. I suggest updating the ping readme to use this particular configuration instead of the current one (remove the CapabilityBoundingSet clause).
For further readers: there are other limits as well that can interfere with your config. In my case it was the DynamicUser=yes statement, which enables some sandboxing, including NoNewPrivileges=true. Overriding this via systemctl edit won't actually do anything; you need to use systemctl edit --full to have systemd place a new copy of the unit file in /etc.
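One way to check whether this flag is active for a running service is the NoNewPrivs field in /proc/<pid>/status (available since Linux 4.10). The Python sketch below parses that field. The function names are hypothetical, and it assumes you look up the Telegraf PID yourself.

```python
def parse_no_new_privs(status_text: str) -> bool:
    """Return True if the NoNewPrivs bit is set in /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("NoNewPrivs:"):
            return line.split(":", 1)[1].strip() == "1"
    raise ValueError("no NoNewPrivs line found (kernel older than 4.10?)")


def read_status(pid: str = "self") -> str:
    """Read /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return f.read()
```

If parse_no_new_privs(read_status("<telegraf pid>")) returns True, setuid programs such as sudo cannot gain root inside that service, whatever the sudoers file says.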
I'm unable to get the smart plugin to work with a locally-built and installed version of smartmontools using sudo. My telegraf runs as its own user (telegraf), and I've got a sudoers clause set up to enable passwordless execution of /usr/local/sbin/smartctl by Telegraf, yet I get an error.
The log entries are visible below. I think the key message is "sudo: unable to change to root gid: Operation not permitted". I don't understand what this means or why it's appearing. When I manually run sudo -u telegraf to impersonate Telegraf, I'm able to run sudo -n /usr/local/sbin/smartctl --scan just fine, no password needed. Any idea what might be wrong with my configuration?
Relevant telegraf.conf:
Relevant sudoers entries:
System info:
Running Telegraf version 1.13.0-1 for Ubuntu Bionic (18.04)
Expected behavior:
Telegraf runs smartctl and gathers the relevant metrics.
Actual behavior:
Telegraf fails and the following log entries appear in its systemd log: