m-erhardt / check-container-stats

Icinga / Nagios plugins to check metrics of Docker / PodMan containers
GNU General Public License v3.0

Support rootless containers #7

Open leeclemens opened 1 year ago

leeclemens commented 1 year ago

I tried a few ways using sudo and su, but it seems the cleanest way to support rootless containers is by changing the podman ps and podman stats commands directly in this check script.

josy1024 commented 1 year ago

This script works when run as root, but not as an NRPE task ;-(

I'm trying to monitor a rootless container (user mysql-user) with NRPE.

I've tried these variants of NRPE commands:

# your code with `su`
command[check_container_mysql-mysqldb0]=
su - mysql-user ${checkpluginsdir}/check_container_stats_podman.py -c mysqldb0

# leeclemens code:
${checkpluginsdir}/check_container_stats_podman.py -c mysqldb0 -u mysql-user

sudoers:

Cmnd_Alias        ICIPODMAN = /usr/lib64/nagios/plugins/check_container_stats_podman.py
nrpe            ALL=(ALL) NOPASSWD: ICIPODMAN
nrpe            ALL=(ALL) NOPASSWD: /usr/bin/podman
nrpe            ALL=(ALL) NOPASSWD: /usr/bin/systemd-run
Defaults!ICIPODMAN !requiretty

I look forward to a few flashes of inspiration! @leeclemens @m-erhardt

ERROR MESSAGE

"UNKNOWN - docker stats command returned error: b'Failed to connect to bus: Permission denied Failed to start transient service unit: Transport endpoint is not connected"

m-erhardt commented 1 year ago

@josy1024

I haven't been able to get @leeclemens' solution to work either.

What has worked for me in order to check rootless podman containers is to run sudo with the -i flag. This spawns a login shell as the specified user and executes the plugin within that user shell.

Apparently the podman binary requires certain environment variables ($XDG_RUNTIME_DIR?) and tries to perform a chdir to $HOME.
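One way to test this guess is to print what the plugin's process actually sees when NRPE launches it. The sketch below is a hypothetical diagnostic helper, not part of the plugin: the name `rootless_env_report` is invented here, and it only inspects the two values suspected above ($XDG_RUNTIME_DIR and $HOME).

```python
import os


def rootless_env_report(environ=None):
    """Describe the environment values suspected above.

    Hypothetical helper, not part of check_container_stats_podman.py.
    Pass a dict to inspect a specific environment; defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    home = env.get("HOME")
    return {
        "XDG_RUNTIME_DIR": runtime_dir,
        # True only if the variable is set and points at an existing directory
        "runtime_dir_exists": bool(runtime_dir) and os.path.isdir(runtime_dir),
        "HOME": home,
        # True only if the current user may enter (chdir into) $HOME
        "home_accessible": bool(home) and os.access(home, os.X_OK),
    }


if __name__ == "__main__":
    for key, value in rootless_env_report().items():
        print(f"{key}={value}")
```

Running it once through `su`, once through plain `sudo -u`, and once through `sudo -i -u` should show which invocation leaves these values unset.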

NRPE check command:

/usr/bin/sudo -n -i -u podman /usr/lib64/nagios/plugins/check_container_stats_podman.py -c testcontainer
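For anyone who wants to drive the same invocation from a wrapper script, here is a minimal sketch that assembles the command above as an argument list. The function names `build_check_command` and `run_check` are made up for illustration; only the sudo flags and the plugin path come from the command in this comment.

```python
import subprocess

SUDO = "/usr/bin/sudo"
PLUGIN = "/usr/lib64/nagios/plugins/check_container_stats_podman.py"


def build_check_command(user, container, plugin=PLUGIN):
    """Build the argv for the sudo invocation shown above.

    -n: never prompt for a password (fail instead of hanging under NRPE)
    -i: run the command through the target user's login shell
    -u: the user that owns the rootless container
    """
    return [SUDO, "-n", "-i", "-u", user, plugin, "-c", container]


def run_check(user, container):
    """Run the plugin and return (exit_code, first line of its output)."""
    result = subprocess.run(
        build_check_command(user, container),
        capture_output=True,
        text=True,
        check=False,  # Nagios plugins report their state via the exit code
    )
    output = (result.stdout or result.stderr).strip()
    first_line = output.splitlines()[0] if output else ""
    return result.returncode, first_line
```

Passing a list instead of a shell string avoids quoting problems with container names; sudo itself still hands the command to the login shell because of `-i`.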

Sudoers file: Don't forget to change NOPASSWD:/bin/bash if the default shell for your user is something other than /bin/bash.

nrpe  ALL=(podman) NOPASSWD:/bin/bash -c /usr/lib64/nagios/plugins/check_container_stats_podman.py *

josy1024 commented 1 year ago

thanks!!!

I've got the check_nrpe script working from the Icinga host, and it looks like I have a podman issue! The container is listed as "Created", but the container and its services are running (confused).

podman ps --all shows

CONTAINER ID  IMAGE                        COMMAND  CREATED       STATUS   PORTS                   NAMES
2d9e9d6fd8f1  docker.io/library/mysql:8.0  mysqld   18 hours ago  Created  0.0.0.0:3306->3306/tcp  mysqldb0

from icinga:

/etc/icinga2/plugins/check_nrpe -H hostname -c check_container_mysqldb0
CRITICAL - Container p2p-mysqldb0 is Created

m-erhardt commented 1 year ago

@josy1024: If you've verified that the container is definitely running, there might be a breaking change in the podman output.

Can you execute the following commands for me (as the user running the container):

[]# podman ps -a -f name=^mysqldb0$ --format "{{.Names}},{{.Status}},{{.Size}},{{.RunningFor}}" --size
mysqldb0,Up 2 seconds,0B (virtual 7.05MB),2 seconds ago

[]# podman --version
podman version 4.4.1

The first command is the one the plugin uses to determine the container state.

josy1024 commented 1 year ago

It seems I have an old version of podman... I'll give an update when I have 4.4!

podman ps -a -f name=^mysqldb0$ --format "{{.Names}},{{.Status}},{{.Size}},{{.RunningFor}}" --size
mysqldb0,Up 16 minutes ago,18B (virtual 565MB),3 hours ago

[user@vmxxx overlay-containers]$ podman --version
podman version 4.2.0

m-erhardt commented 1 year ago

It seems that your container wasn't running when you executed podman ps --all the last time and there is NO breaking change in the podman-output 😉.

The plugin expects the 2nd field (Status) of the command output to show Up <duration>, which is also the case with your version. The last time you ran podman ps --all, the Status field showed Created instead of Up.
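To make the check described here concrete, this is a minimal sketch of parsing one line of that comma-separated `podman ps` output and testing the Status field. It is not the plugin's actual code: the names `parse_ps_line`, `is_running` and `evaluate` are hypothetical, and the OK message text is an assumption (only the CRITICAL wording appears in this thread).

```python
def parse_ps_line(line):
    """Split 'Names,Status,Size,RunningFor' into a dict of its four fields."""
    fields = line.strip().split(",", 3)
    if len(fields) != 4:
        raise ValueError(f"unexpected podman ps output: {line!r}")
    names, status, size, running_for = fields
    return {
        "names": names,
        "status": status,
        "size": size,
        "running_for": running_for,
    }


def is_running(status):
    """A container counts as running when Status starts with 'Up'."""
    return status.startswith("Up")


def evaluate(line):
    """Map one output line to a (Nagios exit code, message) pair."""
    info = parse_ps_line(line)
    if is_running(info["status"]):
        # Assumed wording; the thread does not show the plugin's OK message.
        return 0, f"OK - Container {info['names']} is {info['status']}"
    # First word of the status, e.g. 'Created'
    state = info["status"].split()[0] if info["status"] else "unknown"
    return 2, f"CRITICAL - Container {info['names']} is {state}"
```

Fed the two outputs from this thread, the 4.4.1 line (`Up 2 seconds`) and the 4.2.0 line (`Up 16 minutes ago`) both count as running, while a `Created` status yields the CRITICAL result seen earlier.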

josy1024 commented 1 year ago

update - podman ici stats working!

I'm reverting back to a "root" container and had to play around with sudoers. nrpe.cfg:

/usr/bin/sudo -n -i /usr/lib64/nagios/plugins/check_container_stats_podman_root.py -c mysqldb0
/usr/bin/sudo -n -i -u rootlessuser /usr/lib64/nagios/plugins/check_container_stats_podman_root.py -c mysqldb0

nrpe needed sudoers rights to access other users' podman containers.

nrpe            ALL=(ALL) NOPASSWD: /usr/bin/podman
nrpe            ALL=(ALL) NOPASSWD: /usr/bin/systemd-run
nrpe            ALL=(ALL) NOPASSWD:/bin/bash -c /usr/lib64/nagios/plugins/check_container_stats_podman_root.py *

💡 The podman containers are not found because nrpe itself has no containers!!

IF: NRPE: Unable to read output => check sudoers

IF: Container not Found => check commands and add sudo

IF: CRITICAL - Container mysqldb0 is Created => container service file has different IDs than the running container (= PODMAN ISSUE)
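The three cases above can also be written down as a small lookup, for example to annotate check output in a wrapper. This is only a sketch: `triage` and `HINTS` are hypothetical names, and the substrings are taken from the messages quoted in this thread, so they may need adjusting to the exact plugin output.

```python
# (substring to look for, hint) pairs, in the order listed above.
HINTS = [
    ("NRPE: Unable to read output", "check sudoers"),
    ("not found", "check commands and add sudo"),
    ("is Created",
     "container service file has different IDs than the running container"),
]


def triage(check_output):
    """Return the first matching hint for a check result, or None."""
    lowered = check_output.lower()
    for needle, hint in HINTS:
        # Case-insensitive match, since the exact capitalisation is uncertain
        if needle.lower() in lowered:
            return hint
    return None
```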

I've used a mix of sudo and this file for the check: (check_container_stats_podman_root.py) = https://github.com/m-erhardt/check-container-stats/blob/master/check_container_stats_podman.py