Open tgriep opened 3 years ago
Can you provide details about the affected partition? For example the output of the following df commands:
[root@centos7-01 src]# df -ih /boot/
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda1 512K 355 512K 1% /boot
[root@centos7-01 src]# df -ahHT /boot/
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 xfs 1.1G 322M 742M 31% /boot
I asked the customer to run those commands. As soon as I get the data, I'll update it here.
Here is the data.
df -ih /boot/
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda1 398K 349 398K 1% /boot
df -ahHT /boot/
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 xfs 411M 273M 139M 67% /boot
I can reproduce this issue on my test server.
On my server, the /proc/fs/nfsd filesystem is causing this issue.
This filesystem has no inodes:
[root@cl01 build]# df -i /proc/fs/nfsd
Filesystem Inodes IUsed IFree IUse% Mounted on
nfsd 0 0 0 - /proc/fs/nfsd
As a workaround, I added the nfsd filesystem type to the exclude_fs_types directive in ncpa.cfg and restarted the ncpa_listener service.
Can you please check whether you have a filesystem that has no inodes and is not listed in the exclude_fs_types directive of ncpa.cfg? This is the output on my server:
[root@cl01 build]# df -ia
Filesystem Inodes IUsed IFree IUse% Mounted on
sysfs 0 0 0 - /sys
proc 0 0 0 - /proc
devtmpfs 232288 378 231910 1% /dev
securityfs 0 0 0 - /sys/kernel/security
tmpfs 235239 1 235238 1% /dev/shm
devpts 0 0 0 - /dev/pts
tmpfs 235239 515 234724 1% /run
tmpfs 235239 16 235223 1% /sys/fs/cgroup
cgroup 0 0 0 - /sys/fs/cgroup/systemd
pstore 0 0 0 - /sys/fs/pstore
cgroup 0 0 0 - /sys/fs/cgroup/perf_event
cgroup 0 0 0 - /sys/fs/cgroup/cpu,cpuacct
cgroup 0 0 0 - /sys/fs/cgroup/devices
cgroup 0 0 0 - /sys/fs/cgroup/net_cls,net_prio
cgroup 0 0 0 - /sys/fs/cgroup/memory
cgroup 0 0 0 - /sys/fs/cgroup/freezer
cgroup 0 0 0 - /sys/fs/cgroup/blkio
cgroup 0 0 0 - /sys/fs/cgroup/hugetlb
cgroup 0 0 0 - /sys/fs/cgroup/pids
cgroup 0 0 0 - /sys/fs/cgroup/cpuset
configfs 0 0 0 - /sys/kernel/config
/dev/mapper/centos_centos7--01-root 8910848 198245 8712603 3% /
selinuxfs 0 0 0 - /sys/fs/selinux
systemd-1 - - - - /proc/sys/fs/binfmt_misc
mqueue 0 0 0 - /dev/mqueue
hugetlbfs 0 0 0 - /dev/hugepages
debugfs 0 0 0 - /sys/kernel/debug
nfsd 0 0 0 - /proc/fs/nfsd
/dev/sda1 524288 356 523932 1% /boot
sunrpc 0 0 0 - /var/lib/nfs/rpc_pipefs
binfmt_misc 0 0 0 - /proc/sys/fs/binfmt_misc
tmpfs 235239 1 235238 1% /run/user/0
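Instead of reading through the df -ia output by eye, the same check could be scripted. The following is only a minimal sketch and is not part of NCPA: zero_inode_mounts is a hypothetical helper name, and it assumes a Linux system where /proc/mounts lists one mount per line as "device mountpoint fstype options ...".

```python
import os


def zero_inode_mounts(mount_lines, statvfs=os.statvfs):
    """Return (mountpoint, fstype) pairs that report zero total inodes.

    mount_lines: iterable of lines in /proc/mounts format.
    statvfs: injectable for testing; defaults to os.statvfs.
    """
    found = []
    for line in mount_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # Note: /proc/mounts escapes spaces in paths as \040; this
        # sketch does not decode them.
        mountpoint, fstype = fields[1], fields[2]
        try:
            st = statvfs(mountpoint)
        except OSError:
            continue  # mount not accessible, ignore it
        if st.f_files == 0:
            found.append((mountpoint, fstype))
    return found


if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        for mountpoint, fstype in zero_inode_mounts(fh):
            print("%s (%s) reports 0 inodes" % (mountpoint, fstype))
```

Any filesystem type printed here that is missing from exclude_fs_types would be a candidate for the workaround described above.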
The customer has a partition whose inodes are full, so that is the issue.
Filesystem Inodes IUsed IFree IUse% Mounted on
nfs:/unix 262615775312 262506836208 108939104 100% /data
They cannot exclude the file system types as they are checking other partitions of the same type.
Thank you for providing this helpful information. I will investigate further and get back to you.
I have no idea what is going wrong here.
How often does the exception occur?
Does any check on the disk node work when the exception occurs?
I can only tell you that the fix in #775 definitely fixes the issue on my test server. Which operating system and version is in use? Maybe I can provide you with an unofficial build including the fix, if you like?
The operating system they are running is the following: Red Hat Enterprise Linux Server release 7.6 (Maipo)
The "ERROR float division by zero" error in the ncpa_listener.log file occurs on every disk call on the system, and the error for the /boot partition is constant.
But as far as I know, all of the disk checks are failing.
If you can provide an unofficial build, I can send that to the user and have them test it.
RHEL 7 sounds great, because I work on such a system and can provide you with the unofficial build. You can find the build here. I hope that the test build fixes the issue. Please let us know whether it does.
Please let me know when you have downloaded the test build, because I want to remove it as soon as possible. Thank you.
I do not have access to the user's system to install the updated package, but I will let them know about the updated RPM. I'll update the issue with the results of the new package.
When trying to monitor the /boot partition of a Linux system, this error is generated in the ncpa_listener.log file:
2021-05-12 19:40:21,714 521180 ERROR float division by zero
Traceback (most recent call last):
  File "/root/ncpa/agent/listener/psapi.py", line 257, in get_root_node
  File "/root/ncpa/agent/listener/psapi.py", line 211, in get_disk_node
  File "/root/ncpa/agent/listener/psapi.py", line 64, in make_mountpoint_nodes
ZeroDivisionError: float division by zero
And the API disk check for that partition displays this error:
{
    "error": {
        "node": "disk",
        "path": "/api/disk",
        "message": "The node requested does not exist.",
        "code": 100
    }
}
In the NCPA agent, the following line is what generates the error: https://github.com/NagiosEnterprises/ncpa/blob/master/agent/listener/psapi.py#L64
iu = st.f_files - st.f_ffree
This line should make iu equal to 0 (or less than 0) if st.f_files is 0, and in that case the following lines should not run:
if iu > 0:
    iup = math.ceil(100 * float(iu) / float(st.f_files))
But somehow the division runs anyway and causes the errors.
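One defensive option would be to guard on the divisor itself rather than on iu. The following is only a sketch of that idea, not the actual change made in #775: inode_usage is a hypothetical helper name, and it assumes an object with f_files and f_ffree attributes such as the result of os.statvfs().

```python
import math
import os


def inode_usage(st):
    """Return (used_inodes, used_percent) for a statvfs-style result.

    The check is on st.f_files directly, so a filesystem that reports
    zero total inodes can never reach the division.
    """
    total = st.f_files
    if total <= 0:
        return 0, 0
    used = max(total - st.f_ffree, 0)
    # float() keeps this a true division on Python 2 as well.
    percent = math.ceil(100 * float(used) / float(total))
    return used, percent


if __name__ == "__main__":
    print(inode_usage(os.statvfs("/")))
```

With the values reported earlier in this thread, a filesystem with 0 inodes (such as nfsd) gives (0, 0), and the customer's /data mount gives 100 percent, which matches the df output.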
See this ticket. https://support.nagios.com/tickets/scp/tickets.php?id=14242