Open 7thsonch opened 4 years ago
Did you try with the parameter -k KEYRING_FILE or --keyring KEYRING_FILE?
Yes I did.
As user "nagios":
nagios@host:~$ /usr/lib/nagios/plugins/check_ceph_osd -i nagios -k /var/lib/nagios/ceph.client.nagios.keyring --host 10.55.0.3 --out
OSD ERROR: 2020-08-26 10:18:32.711 7f2bfead3700 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.nagios.keyring: (13) Permission denied
As root:
root@host:~# /usr/lib/nagios/plugins/check_ceph_osd -i nagios -k /var/lib/nagios/ceph.client.nagios.keyring --host 10.55.0.1 --out
OSD OK
Up OSDs: osd.0 osd.1
Down+In OSDs:
Down+Out OSDs:
| 'osd_up'=2 'osd_down_in'=0;;2 'osd_down_out'=0;;2
It is just a permission issue with the keyring file: it has to be readable by the user nagios.
unable to find a keyring on /etc/pve/priv/ceph.client.nagios.keyring: (13) Permission denied
To test, run this as user nagios: cat /etc/pve/priv/ceph.client.nagios.keyring
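A slightly more explicit version of that test, as a minimal sketch. The helper name check_keyring_access is made up for illustration; it only uses the standard shell test operators and reports on the user that runs it, so it has to be run as nagios.

```shell
#!/bin/sh
# Hypothetical helper (not part of the plugin): report whether the
# *current* user can read a keyring file. Run it as the nagios user.
check_keyring_access() {
    keyring=$1
    if [ ! -e "$keyring" ]; then
        # Also reached when a parent directory is not searchable by this user.
        echo "missing: $keyring"
        return 2
    elif [ ! -r "$keyring" ]; then
        echo "not readable by $(id -un): $keyring"
        return 1
    fi
    echo "readable by $(id -un): $keyring"
}
```

Calling it once with /etc/pve/priv/ceph.client.nagios.keyring and once with /var/lib/nagios/ceph.client.nagios.keyring shows which of the two paths the nagios user can actually use.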
As I said in my initial comment, /etc/pve/priv/ is a special cluster file system where all files are owned by root with no read permissions for any other user. So there is no chance that user nagios can read /etc/pve/priv/ceph.client.nagios.keyring.
That's why I placed the keyring at /var/lib/nagios/ceph.client.nagios.keyring, but even if I point to this location (with the --keyring or -k parameter) it still tries to use /etc/pve/priv/ceph.client.nagios.keyring.
Can you try, as user nagios:
/usr/bin/ceph --id nagios --keyring /var/lib/nagios/ceph.client.nagios.keyring osd status
That is what the nagios plugin is doing...
Similar error:
nagios@host:~$ /usr/bin/ceph --id nagios --keyring /var/lib/nagios/ceph.client.nagios.keyring osd status
2020-08-26 14:48:36.801 7f0c73727700 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.nagios.keyring: (13) Permission denied
Error EACCES: access denied: does your client key have mgr caps? See http://docs.ceph.com/docs/master/mgr/administrator/#client-authentication
So ceph still tries to use /etc/pve/priv/ceph.client.nagios.keyring :-(
Well, I think the easiest and maybe only solution will be to run the command with sudo.
Is the user nagios able to read the ceph.conf?
A possible workaround is to define the keyring in a nagios-specific ceph.conf and pass that file to the client.
nagios.ceph.conf (a modified copy of your normal ceph.conf):
[global]
keyring = /var/lib/nagios/ceph.client.nagios.keyring
fsid = ...
and then test with:
/usr/bin/ceph -c /var/lib/nagios/nagios.ceph.conf --id nagios osd status
If it works, you can then:
/usr/lib/nagios/plugins/check_ceph_osd -c /var/lib/nagios/nagios.ceph.conf -i nagios --host 10.55.0.1 --out
If it doesn't work, I'm out of ideas, and you should use sudo.
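Creating that modified copy can be scripted. The following is only a sketch: make_nagios_conf is a hypothetical helper, and it assumes the source ceph.conf already contains a "keyring =" line and that the new keyring path contains no "|" or "&" characters (both are special to the sed expression used here).

```shell
#!/bin/sh
# Hypothetical helper: copy a ceph.conf and point every "keyring =" line
# at a keyring that the monitoring user can read.
make_nagios_conf() {
    src=$1      # the normal ceph.conf
    dst=$2      # e.g. /var/lib/nagios/nagios.ceph.conf
    keyring=$3  # e.g. /var/lib/nagios/ceph.client.nagios.keyring
    # "|" is the sed delimiter, so the slashes in the path need no escaping.
    sed -E "s|^([[:space:]]*keyring[[:space:]]*=).*|\1 ${keyring}|" "$src" > "$dst" || return 1
    # Fail loudly when there was no keyring line to rewrite; in that case
    # the line has to be added by hand under the right section.
    grep -Eq '^[[:space:]]*keyring[[:space:]]*=' "$dst" || {
        echo "no keyring line found in $src" >&2
        return 1
    }
}
```

The resulting file still has to be readable by the nagios user, and it is the file passed with -c in the commands above.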
I ran into the same problem after upgrading the checks from 1.5.5 to 1.5.6.
The keyring file defined in the given ceph.conf now takes precedence over the keyring defined with the -k parameter.
In my case the one defined in ceph.conf (/etc/pve/priv/$cluster.$name.keyring) didn't exist for the nagios user, and the check errored on that without trying the file specified by -k (which did exist with proper permissions).
I changed my setup to simply put the correct path to the nagios keyring into the ceph.conf used by the checks and removed the -k flag. I guess it's cleaner that way anyway.
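Before dropping -k it helps to see which keyring path a given ceph.conf actually sets. This is a rough sketch only: show_conf_keyrings is a hypothetical helper that does a plain text scan with awk. It does not expand Ceph metavariables such as $cluster or $name and knows nothing about Ceph's built-in defaults.

```shell
#!/bin/sh
# Hypothetical helper: list every "keyring =" setting in a ceph.conf
# together with the section it appears in.
show_conf_keyrings() {
    awk '
        BEGIN { section = "(no section)" }
        # Remember the most recent [section] header, trimmed of whitespace.
        /^[ \t]*\[/ {
            section = $0
            gsub(/^[ \t]+|[ \t]+$/, "", section)
            next
        }
        # Print the value of each keyring line, unexpanded.
        /^[ \t]*keyring[ \t]*=/ {
            value = $0
            sub(/^[^=]*=[ \t]*/, "", value)
            print section " -> " value
        }
    ' "$1"
}
```

Run against the ceph.conf handed to the check with -c, a line pointing into /etc/pve/priv/ is the one the nagios user will trip over.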
@rfpronk Hi there, I'm having exactly this problem right now with Prox+Ceph. Can you share what exactly you put into your /etc/pve/ceph.conf to make this work? Do you still need the nagios keyring file at all? I can't seem to work out what to put into ceph.conf to get it working. Thanks!
What I did is to create a copy of ceph.conf in /etc/ceph and call it ceph_icinga.conf for example. In this file you have to change the path of the keyring-file to a different directory like /etc/icinga2/ for example:
ceph_icinga.conf:
...
[client]
keyring = /etc/icinga2/ceph.client.nagios.keyring
...
The owner of this file has to be nagios, and the nagios user needs read permission on it.
You have to do these steps on each node of your Proxmox cluster where Ceph is running.
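The ownership and permission step could look roughly like this. restrict_keyring is a hypothetical helper, and mode 0400 (read-only for the owner) is an assumption about what is wanted here; changing the owner normally requires root.

```shell
#!/bin/sh
# Hypothetical helper: hand a file to the monitoring user and make it
# readable by that owner only.
restrict_keyring() {
    file=$1
    owner=${2:-}   # optional, e.g. "nagios"
    if [ -n "$owner" ]; then
        chown "$owner" "$file" || return 1
    fi
    chmod 0400 "$file"
}
```

For the setup described above that would be something like restrict_keyring /etc/icinga2/ceph.client.nagios.keyring nagios, run as root on each node.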
On your icinga server I did the following:
In /etc/icinga2/zones.d/globale-template/commands.conf I created an object for each OSD I want to monitor. You have to change the -k and -c parameters accordingly:
object CheckCommand "pve_ceph_osd5" {
  import "plugin-check-command"
  command = [ "/usr/lib/nagios/plugins/check_ceph_osd" ]
  arguments = {
    "-i" = "nagios"
    "-e" = "/usr/bin/ceph"
    "-k" = "/etc/icinga2/ceph.client.nagios.keyring"
    "-c" = "/etc/ceph/ceph_icinga.conf"
    "-H" = "..."
    "-I" = "5"
  }
}
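Since one such object is needed per OSD, the definitions can be generated instead of copied by hand. A sketch, assuming the same paths as in the object above; print_osd_check_command is a hypothetical helper and the monitor host is passed in as a parameter because it is elided in the example.

```shell
#!/bin/sh
# Hypothetical generator: print one CheckCommand object for a given OSD id.
# The paths mirror the example object and are assumptions for other setups.
print_osd_check_command() {
    osd_id=$1
    ceph_host=$2
    cat <<EOF
object CheckCommand "pve_ceph_osd${osd_id}" {
  import "plugin-check-command"
  command = [ "/usr/lib/nagios/plugins/check_ceph_osd" ]
  arguments = {
    "-i" = "nagios"
    "-e" = "/usr/bin/ceph"
    "-k" = "/etc/icinga2/ceph.client.nagios.keyring"
    "-c" = "/etc/ceph/ceph_icinga.conf"
    "-H" = "${ceph_host}"
    "-I" = "${osd_id}"
  }
}
EOF
}
```

A loop such as for id in 0 1; do print_osd_check_command "$id" 10.55.0.1; done then prints one object per OSD, ready to be pasted into commands.conf.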
I've just upgraded our Proxmox cluster to 6.2 including the upgrade from Ceph Luminous to Nautilus. All other checks (check_ceph_health, check_ceph_mon, check_ceph_df) still work as expected but check_ceph_osd is refusing to work.
I'm using the following command:
/usr/lib/nagios/plugins/check_ceph_osd -i nagios --key /var/lib/nagios/ceph.client.nagios.keyring --host 10.55.0.1 --out
followed by this error:
OSD ERROR: 2020-08-26 09:32:03.862 7fbe53968700 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.nagios.keyring: (2) No such file or directory
2020-08-26 09:32:03.862 7fbe53968700 -1 AuthRegistry(0x7fbe4c081ff8) no keyring found at /etc/pve/priv/ceph.client.nagios.keyring, disabling cephx
I don't know why ceph is looking for a key in /etc/pve/priv/ceph.client.nagios.keyring. If I copy my key from /var/lib/nagios/ceph.client.nagios.keyring to /etc/pve/priv/ceph.client.nagios.keyring, the command works as expected, but only as user root. In Proxmox, /etc/pve/priv is a special cluster file system where all files are owned by root with no read permissions for any other user. Of course I would like to avoid running the check as root.
The keyring has been created with
ceph auth get-or-create client.nagios mon 'allow r' osd 'allow r' > /var/lib/nagios/ceph.client.nagios.keyring
Maybe that's the same problem as in issue #30, but it worked in Luminous?