deniszh / collectd-iostat-python

Collectd-iostat-python is an iostat plugin for collectd that allows you to graph Linux iostat metrics in graphite or other output formats that are supported by collectd.
MIT License

Unhandled python exception in read callback: OSError: [Errno 13] Permission denied #23

Open Lotus907efi opened 6 years ago

Lotus907efi commented 6 years ago

I have set up the collectd_iostat_python.py module in /usr/lib64/collectd/iostatpy/ and used the following config file:

Globals true
ModulePath "/usr/lib64/collectd/iostat_specialpy"
Import "collectd_iostat_python"
Path "/usr/bin/iostat"
Interval 30
IostatInterval 12
Count 2
Verbose false
NiceNames true
PluginName collectd_iostat_python

When I run collectd from the command line as root using the command:

/usr/sbin/collectd -f -C /etc/collectd.conf

everything seems fine, but when I start it with:

systemctl start collectd

it starts, but when I check the status using:

systemctl status -l collectd

I see this error displayed:

May 31 20:58:52 overcloud-controller-2 collectd[676789]: Unhandled python exception in read callback: OSError: [Errno 13] Permission denied

If I run the collectd_iostat_python.py module using:

python /usr/lib64/collectd/iostatpy/collectd_iostat_python.py

as root or a regular user I get things like this back:

sda.wrqm_s:0.0 sda.rrqm_s:0.0 sda.await:0.0 sda.svctm:0.0 sda.kB_read:0.0 sda.avgrq_sz:9.09 sda.w_await:0.0 sda.avgqu_sz:0.0 sda.r_s:0.0 sda.kB_wrtn_s:72.75 sda.kB_read_s:0.0 sda.rkB_s:0.0 sda.w_s:16.0 sda.kB_wrtn:145.0 sda.wkB_s:72.75 sda.tps:16.0 sda.r_await:0.0 sda._util:0.0

What could be causing this permission denied error? I tried adding executable permission to the collectd_iostat_python.py file I have installed in /usr/lib64/collectd/iostat_specialpy/ and checked the permissions on the directory, but I still get the same error when I start the collectd service.
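One way to narrow this down is to list the permissions on every directory level leading to a file, since a missing search bit on any parent directory also produces Errno 13. The following is only a minimal sketch; `describe_path` is a hypothetical helper name, not part of the plugin, and the result is only meaningful when the code runs in the same context as the collectd service (a root shell may see different results).

```python
import os


def describe_path(path):
    """Return (path, octal mode, X_OK for this process) for each level of `path`.

    Hypothetical diagnostic helper; not part of collectd or the plugin.
    """
    current = os.path.abspath(path)
    levels = []
    # Collect the path itself and every parent up to the filesystem root.
    while True:
        levels.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent

    report = []
    for level in reversed(levels):
        try:
            mode = "%o" % (os.stat(level).st_mode & 0o7777)
        except OSError as exc:
            mode = "stat failed: %s" % exc
        # os.access checks with the real uid/gid of the calling process.
        report.append((level, mode, os.access(level, os.X_OK)))
    return report


if __name__ == "__main__":
    # Paths taken from the report above; adjust as needed.
    for target in ("/usr/bin/iostat", "/usr/lib64/collectd/iostat_specialpy"):
        for level, mode, ok in describe_path(target):
            print("%-45s %-6s %s" % (level, mode, "ok" if ok else "NOT accessible"))
```

Any level reported as not accessible is a candidate for the error, though a clean result does not rule out other restrictions placed on the service.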

deniszh commented 6 years ago

Hard to say... I think it requires root permissions for something. Probably not for iostat itself, which likely has the setuid bit set, but there is a lot of other functionality involved, such as pyudev.

deniszh commented 6 years ago

According to https://github.com/collectd/collectd/blob/master/contrib/systemd.collectd.service it probably requires some extra capabilities.
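To see which capabilities a process actually holds, the `CapEff` and `CapBnd` fields in `/proc/<pid>/status` can be decoded. This is a rough sketch under stated assumptions: `decode_caps` and `read_cap_sets` are hypothetical helper names, and the name table is deliberately partial (bit numbers follow the Linux capability list; bits not in the table are printed by number).

```python
# Partial table of Linux capability bit numbers (assumption: only the
# ones relevant to this discussion are listed here).
CAP_NAMES = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    2: "CAP_DAC_READ_SEARCH",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    12: "CAP_NET_ADMIN",
    21: "CAP_SYS_ADMIN",
}


def decode_caps(hex_mask):
    """Turn a hex capability mask into a list of names, lowest bit first."""
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            names.append(CAP_NAMES.get(bit, "bit %d" % bit))
        bit += 1
    return names


def read_cap_sets(status_path="/proc/self/status"):
    """Read the effective and bounding capability sets from a status file."""
    sets = {}
    with open(status_path) as handle:
        for line in handle:
            key, _, value = line.partition(":")
            if key in ("CapEff", "CapBnd"):
                sets[key] = decode_caps(value.strip())
    return sets


if __name__ == "__main__":
    for name, caps in sorted(read_cap_sets().items()):
        print("%s: %s" % (name, ", ".join(caps) or "(none)"))
```

Passing the status file of the running collectd process instead of the default shows what the service holds. Note that a root process without CAP_DAC_OVERRIDE is subject to ordinary file permission checks, so that is one thing worth looking for.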

Lotus907efi commented 6 years ago

I checked when the collectd process is running and it looks like it runs as root:

root 15407 0.1 0.0 1283992 11048 ? Ssl May31 1:24 /usr/sbin/collectd

but I do not really know about the python plugin or the modules it imports. I don't know if they also run as root or not.

BTW, the version of collectd I am using is 5.8.0 and the python plugin is linked against the following:

$ ldd /usr/lib64/collectd/python.so
linux-vdso.so.1 => (0x00007ffdd84e5000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fdbef1c3000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fdbeefbf000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007fdbeedbb000)
libm.so.6 => /lib64/libm.so.6 (0x00007fdbeeab9000)
libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007fdbee6ed000)
libc.so.6 => /lib64/libc.so.6 (0x00007fdbee329000)
/lib64/ld-linux-x86-64.so.2 (0x0000558e9a637000)

The actual version of python installed on the system is 2.7.5

I just tried setting the following in the /usr/lib/systemd/system/collectd.service file:

[Service]
ExecStart=/usr/sbin/collectd
Restart=on-failure
Type=notify
CapabilityBoundingSet=CAP_SETUID CAP_SETGID CAP_SYS_ADMIN CAP_NET_ADMIN

then I did

systemctl daemon-reload
systemctl restart collectd

but I still see the following in the status messages:

Jun 01 15:12:19 overcloud-controller-2 collectd[867193]: Unhandled python exception in read callback: OSError: [Errno 13] Permission denied

so all of that seems to have made no difference.
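Since the log line only shows the exception message, it may help to wrap the read callback so that the errno, the filename (when Python provides one) and the traceback are logged before the exception propagates. The following is a sketch, not the plugin's actual code: `report_oserror` is a hypothetical decorator, and inside collectd the logging function passed in would presumably be `collectd.error`.

```python
import errno
import functools
import traceback


def report_oserror(log):
    """Build a decorator that logs OSError details via `log`, then re-raises.

    `log` is any callable taking one string (assumption: collectd.error
    when running inside collectd).
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except OSError as exc:
                # exc.filename can be None, e.g. for some subprocess failures.
                log("OSError errno=%s (%s) filename=%r\n%s" % (
                    exc.errno,
                    errno.errorcode.get(exc.errno, "?"),
                    exc.filename,
                    traceback.format_exc(),
                ))
                raise
        return wrapper
    return decorator
```

With this in place the traceback in the service log should show which call failed, for example launching iostat versus opening a device or a file.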

Lotus907efi commented 6 years ago

I just tried starting the collectd daemon from the command line as root rather than starting it using systemctl, and it seems that the permission denied error does not happen this way. I will have to investigate this further.

If I start the collectd service manually as root from the command line, it does run and does not throw this error, but I do not see any iostat-related metrics appearing in my Prometheus system as I would expect.