exasol / nagios-monitoring

Docker container with installed and configured Nagios software for EXASOL DB monitoring.
MIT License

check_db_diskspace: ZeroDivisionError #13

Closed rvolykh closed 5 years ago

rvolykh commented 6 years ago

Hello, thanks for the scripts, they are very helpful. But I am stuck on an error in the check_db_diskspace.py script. It cannot calculate the free space and exits with a ZeroDivisionError. I found that the size of the storage partition is not calculated:

partitions = [{
    'next_fsck': 'Wed Nov 15 18:13:39 2017',
    'raid': 'disk-raid-none',
    'name': 'd00_os',
    'mount_count': '34/39',
    'devices': 'Default',
    'state': 'None',
    'free': '42.4',
    'encr': 'disk-encr-aes256',
    'type': 'disk-type-os',
    'size': '50'
}, {
    'next_fsck': '-',
    'raid': 'disk-raid-none',
    'name': 'd01_swap',
    'mount_count': '-',
    'devices': 'Default',
    'state': 'None',
    'free': '8.0',
    'encr': 'disk-encr-aes256',
    'type': 'disk-type-swap',
    'size': '16'
}, {
    'next_fsck': 'Wed Nov 15 18:13:39 2017',
    'raid': 'disk-raid-none',
    'name': 'd02_data',
    'mount_count': '34/39',
    'devices': 'Default',
    'state': 'None',
    'free': '42.4',
    'encr': 'disk-encr-aes256',
    'type': 'disk-type-data',
    'size': '50'
}, {
    'next_fsck': '-',
    'raid': 'disk-raid-none',
    'name': 'd03_storage',
    'mount_count': '-',
    'devices': ['/dev/sdb'],
    'state': 'None',
    'free': '0.0',
    'encr': 'disk-encr-aes256',
    'type': 'disk-type-storage',
    'size': '0'
}]
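The d03_storage entry above reports 'size': '0', so any usage calculation that divides by the size will fail on it. The thread does not show the formula check_db_diskspace.py actually uses, so the following is only a minimal sketch under that assumption: `used_percent` is a hypothetical helper, and the sample list is cut down to the two relevant fields from the dump.

```python
def used_percent(partition):
    """Return used space in percent, or None when the reported size is zero.

    Hypothetical helper: the real check script may compute usage differently.
    """
    size = float(partition['size'])
    free = float(partition['free'])
    if size <= 0:
        # Dividing by a size of 0 is what raises ZeroDivisionError.
        return None
    return (size - free) / size * 100.0


# Values copied from the partition dump in this comment.
partitions = [
    {'name': 'd00_os', 'free': '42.4', 'size': '50'},
    {'name': 'd03_storage', 'free': '0.0', 'size': '0'},
]

for p in partitions:
    pct = used_percent(p)
    if pct is None:
        print("UNKNOWN - %s reports size 0" % p['name'])
    else:
        print("%s: %.1f%% used" % (p['name'], pct))
```

Returning None for a zero size lets the caller report the partition as unknown instead of crashing; it does not explain why the size is reported as 0 in the first place.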

EXAStorage:

  Name   Master  Size   Redun-  Type  Disk         Labels                  Prio  Recovery  State
  v0000  1       8 GiB  1       Data  d03_storage  EXASolution_persistent  10    n/a       ONLINE
  v0001  1       1 GiB  1       Data  d03_storage  EXASolution_temporary   10    n/a       ONLINE

Space on Disks:

Name   d03_storage           Sum                   State   Disks
n0010  50 GiB (40 GiB free)  49 GiB (40 GiB free)  ONLINE  [U]

Software Name: EXASolution 6.0.1

florian-reck commented 6 years ago

Hi,

To reproduce this issue, I need to know how many disk drives are used in your installation and which disks are used for EXAStorage.

rvolykh commented 6 years ago

We run the AWS image with 3 disks (sda, sdb, sdc). From the cluster information:

"partitions": [
   "major minor  #blocks  name", 
   "", 
   "202        0  209715200 xvda", 
   "202        1     256000 xvda1", 
   "202        2   52428800 xvda2", 
   "202        3    8388608 xvda3", 
   "202        4  148640768 xvda4"
  ]
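The listing above has the layout of Linux /proc/partitions, where the #blocks column counts 1 KiB blocks. Assuming that unit applies here, a short sketch can convert the entries to GiB for comparison with the sizes shown in EXAoperation. `parse_partitions` is a hypothetical helper and not part of the monitoring scripts.

```python
def parse_partitions(lines):
    """Map device name to size in GiB, assuming #blocks counts 1 KiB blocks."""
    sizes = {}
    for line in lines:
        fields = line.split()
        # Skip the header line and blank lines; data rows have 4 fields
        # with a numeric block count in the third column.
        if len(fields) != 4 or not fields[2].isdigit():
            continue
        sizes[fields[3]] = int(fields[2]) / (1024.0 * 1024.0)
    return sizes


# Lines copied from the cluster information in this comment.
listing = [
    "major minor  #blocks  name",
    "",
    "202        0  209715200 xvda",
    "202        1     256000 xvda1",
    "202        2   52428800 xvda2",
    "202        3    8388608 xvda3",
    "202        4  148640768 xvda4",
]

for name, gib in sorted(parse_partitions(listing).items()):
    print("%-6s %8.2f GiB" % (name, gib))
```

Under this assumption xvda is 200 GiB, xvda2 is 50 GiB and xvda3 is 8 GiB.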

The problem is probably not in the script, because the volumes are not reflected in https://exasol either. Any ideas what is wrong? I stored several large files and nothing changed in EXAoperation, although my data is available in the tables.

florian-reck commented 6 years ago

Hi,

The block devices used by EXAStorage are not directly mapped or mounted into the operating system. They are managed by EXAStorage only. That means if you create large files on the file system, you won't notice them by using the check_db_diskspace check.

But the interesting part for monitoring is to find out why the "d03_storage" partition does not report a size (0 instead).

Can you please post some further information to make it easier for us to reproduce this case? We need to know which EXAoperation role is assigned to your monitoring user.

Thanks in advance, Flo

rvolykh commented 6 years ago

Hello,

I created the user as described in https://github.com/EXASOL/nagios-monitoring#creating-the-user, and I also tried with the admin user: same result.

If you need more information, please specify what would be of interest and I'll try to get it.

florian-reck commented 5 years ago

Can you please check again with the new version? We made many changes to the project, so I hope one of them fixes this issue ;)